Abstract. Starting from an axiomatic perspective, fluctuation geometry is developed as a counterpart approach of inference geometry. This approach is inspired by a notable analogy between the general theorems of inference theory and the general fluctuation theorems associated with a parametric family of distribution functions $dp(I|\theta) = \rho(I|\theta)dI$, which describes the behavior of a set of continuous stochastic variables $I$ driven by a set of control parameters $\theta$. In this approach, statistical properties are rephrased as purely geometric notions derived from the Riemannian structure on the manifold $\mathcal{M}_\theta$ of the stochastic variables $I$. Consequently, this theory arises as an alternative framework for applying the powerful methods of differential geometry to statistical analysis. Fluctuation geometry has direct implications for statistics and physics. This geometric approach inspires a Riemannian reformulation of Einstein fluctuation theory as well as a geometric redefinition of the information entropy for a continuous distribution.

1. Introduction

Inference theory supports the introduction of a Riemannian distance notion [1]:

$$ ds_F^2 = g_{\alpha\beta}(\theta)\, d\theta^\alpha d\theta^\beta \tag{1} $$

to characterize the statistical distance between two close members of a generic parametric family of distribution functions:

$$ dp(I|\theta) = \rho(I|\theta)\, dI. \tag{2} $$

Here, the metric tensor $g_{\alpha\beta}(\theta)$ is provided by the so-called Fisher's information matrix [2-. The existence of this type of Riemannian formulation was first suggested by Rao [5], and it is hereafter referred to as inference geometry‡. Inference geometry provides very strong tools for proving results about statistical models, simply by considering them as well-defined geometrical objects. As expected, inference theory and its geometry have direct applications in those physical theories with a statistical apparatus, such as statistical mechanics and quantum theories. A curious example is the so-called extreme physical information principle, proposed by Frieden in 1998, which claims that all physical laws can be derived from purely inference arguments [1]. Inference geometry has been adapted to the mathematical apparatus of quantum mechanics [5, 6, 7, 8]. In fact, modern interpretations of uncertainty relations are inspired by its arguments [9, 10, 11]. Inference geometry has been successfully employed in statistical mechanics to study phase transitions [12, 13], as well as in the framework of so-called thermodynamics geometry [14].

‡ This approach is referred to as Riemannian geometry on statistical manifolds in the literature. However, the denomination inference geometry is employed here to avoid ambiguity with fluctuation geometry, which is also a Riemannian geometry on a statistical manifold.

The goal of this paper is to show that the parametric family of distribution functions (2) supports the introduction of an alternative Riemannian distance notion:

$$ ds^2 = g_{ij}(I|\theta)\, dI^i dI^j, \tag{3} $$

which characterizes the statistical distance between two close sets of continuous stochastic variables $I$ and $I + dI$ for fixed values of the control parameters $\theta$. This geometrical approach, hereafter referred to as fluctuation geometry, is inspired by a notable analogy between the general theorems of inference theory and some general fluctuation theorems recently proposed in the framework of equilibrium classical fluctuation theory [15, 16, 17, 18]. The main consequence derived from this analysis is the possibility of rephrasing the original parametric family of distribution functions (2) in terms of purely geometric notions. This connection enables the direct application of powerful tools of differential geometry to the analysis of their absolute statistical properties§.

The paper is organized as follows. For the sake of self-consistency, the next section is devoted to the main motivation of the present proposal: the analogy between inference theory and fluctuation theory. Afterwards, an axiomatic formulation of fluctuation geometry is developed. Firstly, the postulates of fluctuation geometry are presented in section 3. Then, their consequences are considered in section 4 to perform a geometric reinterpretation of the statistical description. Section 5 is devoted to two simple application examples of fluctuation geometry. Implications of the present approach for some statistics and physics problems are analyzed in section 6, for example, a reconsideration of the information entropy for a continuous distribution and the development of a Riemannian reformulation of Einstein fluctuation theory. Final remarks and open problems are summarized in section 7.

2. Motivation 

Let us start from the parametric family of distribution functions (2), which describes the behavior of a set of continuous stochastic variables $I$ driven by a set $\theta$ of control parameters. Let us denote by $\mathcal{M}_\theta$ the statistical manifold constituted by all admissible values of the stochastic variables $I$ that are accessible for a given value $\theta$ of the control parameters, which is hereafter assumed to be a simply connected domain. Moreover, let us denote by $\mathcal{P}$ the statistical manifold constituted by all admissible values of the control parameters $\theta$ (each point $\theta \in \mathcal{P}$ represents a given distribution function). The parametric family of distribution functions (2) can be analyzed from two different perspectives:

• To study the fluctuating behavior of the stochastic variables $I \in \mathcal{M}_\theta$, which is the main interest of fluctuation theory [17];

• To study the relationship between this fluctuating behavior and the external control described by the parameters $\theta \in \mathcal{P}$, which is the interest of inference theory [2].

§ The tensorial formalism of Riemannian geometry allows one to study the absolute geometric properties of a manifold $\mathcal{M}$ using any of its coordinate representations $\mathcal{R}_I$. A relevant example of an absolute property is the curvature of the manifold $\mathcal{M}$, which is manifested in any coordinate representation $\mathcal{R}_I$. In general relativity, the curvature of the space-time $\mathcal{M}^4$ is identified with the gravitational interaction. The effects of gravitation are absolute or irreducible, while the effects associated with inertial forces are reducible. In fact, the existence of inertial forces crucially depends on the reference frame, that is, on the specific coordinate representation of the space-time $\mathcal{M}^4$.

2.1. Fluctuation theory

Let us admit that the probability density $\rho(I|\theta)$ is everywhere finite and differentiable, and obeys the following conditions for every point $I_b$ located on the boundary $\partial\mathcal{M}_\theta$ of the statistical manifold $\mathcal{M}_\theta$, $I_b \in \partial\mathcal{M}_\theta$:

$$ \lim_{I \to I_b} \rho(I|\theta) = \lim_{I \to I_b} \frac{\partial}{\partial I^i}\rho(I|\theta) = 0. \tag{4} $$

The probability density $\rho(I|\theta)$ can be used to introduce the differential forces $\eta_i(I|\theta)$:

$$ \eta_i(I|\theta) = -\frac{\partial}{\partial I^i}\log\rho(I|\theta). \tag{5} $$

By definition, the differential forces $\eta_i(I|\theta)$ vanish at those points where the probability density $\rho(I|\theta)$ exhibits its local maxima or its local minima. The global (local) maximum of the probability density can be regarded as a stable (metastable) equilibrium point $\bar I$, which can be obtained from the following stationary and stability conditions:

$$ -\frac{\partial}{\partial I^i}\log\rho(\bar I|\theta) = 0, \qquad -\frac{\partial^2}{\partial I^i \partial I^j}\log\rho(\bar I|\theta) \succ 0, \tag{6} $$

where $A_{ij} \succ 0$ denotes that the matrix $A_{ij}$ is positive definite. In general, the differential forces $\eta_i(I|\theta)$ characterize the deviation of a given point $I \in \mathcal{M}_\theta$ from these local equilibrium points. Analogously, it is convenient to introduce the response matrix $\chi_{ij}(I|\theta)$:

$$ \chi_{ij}(I|\theta) = \partial_i \eta_j(I|\theta), \tag{7} $$

where $\partial_i A = \partial A/\partial I^i$, which describes the response of the differential forces $\eta_j(I|\theta)$ under an infinitesimal change of the variable $I^i$.

As stochastic variables, the expectation values of the differential forces $\eta_i = \eta_i(I|\theta)$ identically vanish:

$$ \langle \eta_i \rangle = 0, \tag{8} $$

and these quantities also obey the fundamental and the associated fluctuation theorems [17]:

$$ \langle \eta_i\, \delta I^j \rangle = \delta_i^j, \tag{9} $$

$$ \langle \chi_{ij} \rangle = \langle \eta_i \eta_j \rangle, \tag{10} $$

where $\delta_i^j$ is the Kronecker delta. The previous theorems are derived from the following identity:

$$ \langle \partial_i A(I|\theta) \rangle = \langle \eta_i(I|\theta) A(I|\theta) \rangle \tag{11} $$

by substituting the cases $A(I|\theta) = 1$, $I^j$ and $\eta_j$, respectively. Here, $A(I)$ is a differentiable function defined on the continuous variables $I$ with definite expectation values $\langle \partial A(I|\theta)/\partial I^i \rangle$ that obeys the following boundary condition:

$$ \lim_{I \to I_b} A(I)\rho(I|\theta) = 0. \tag{12} $$

Moreover, equation (11) follows from the integral expression:

$$ \int_{\mathcal{M}_\theta} \frac{\partial \upsilon^j(I|\theta)}{\partial I^j}\rho(I|\theta)\, dI = \oint_{\partial\mathcal{M}_\theta} \rho(I|\theta)\upsilon^j(I|\theta)\, d\Sigma_j - \int_{\mathcal{M}_\theta} \upsilon^j(I|\theta)\frac{\partial \rho(I|\theta)}{\partial I^j}\, dI \tag{13} $$

derived from the intrinsic exterior calculus of the statistical manifold $\mathcal{M}_\theta$ and the imposition of the constraint $\upsilon^j(I|\theta) = \delta^j_i A(I|\theta)$. It is easy to realize that the identity (8) and the associated fluctuation theorem (10) are just the stationary and stability equilibrium conditions (6) written in terms of statistical expectation values, respectively. In particular, the positive definite character of the self-correlation matrix $M_{ij}(\theta) = \langle \eta_i(I|\theta)\eta_j(I|\theta) \rangle$ implies the positive definiteness of the matrix:

$$ \langle \chi_{ij}(I|\theta) \rangle \succ 0. \tag{14} $$

Remarkably, the fundamental fluctuation theorem (9) suggests the statistical complementarity of the variable $I^i$ and its conjugated differential force $\eta_i = \eta_i(I|\theta)$. Using the Schwarz inequality $\langle \delta A\, \delta B \rangle^2 \le \langle \delta A^2 \rangle \langle \delta B^2 \rangle$, one obtains the following inequality:

$$ \Delta I^i \Delta \eta_i \ge 1, \tag{15} $$

where $\Delta x = \sqrt{\langle \delta x^2 \rangle}$ is the statistical uncertainty of the quantity $x$. Clearly, this last inequality exhibits the same mathematical appearance as Heisenberg's uncertainty relation $\Delta q \Delta p \ge \hbar$. Recently, this result was employed to show the existence of uncertainty relations involving conjugated thermodynamic quantities [15, 16]. Equation (15) can be generalized by considering the inverse $M^{ij}(\theta)$ of the self-correlation matrix of the differential forces $M_{ij}(\theta) = \langle \eta_i(I|\theta)\eta_j(I|\theta) \rangle$. Denoting by $C^{ij}(\theta) = \langle \delta I^i \delta I^j \rangle$ the self-correlation matrix of the stochastic variables $I$, it is possible to obtain the following matrix inequality:

$$ C^{ij}(\theta) - M^{ij}(\theta) \succeq 0. \tag{16} $$

This last inequality is directly obtained from the positive definiteness of the self-correlation matrix $K^{ij}(\theta) = \langle J^i(I|\theta) J^j(I|\theta) \rangle$ of the auxiliary quantities $J^i(I|\theta) = \delta I^i - M^{ij}(\theta)\eta_j(I|\theta)$, since the fundamental fluctuation theorem (9) yields $K^{ij}(\theta) = C^{ij}(\theta) - M^{ij}(\theta)$. Accordingly, the self-correlation matrix $C^{ij}(\theta)$ of the stochastic variables $I$ is bounded from below by the inverse $M^{ij}(\theta)$ of the self-correlation matrix of the differential forces $\eta_i$.
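
The fluctuation theorems (8)-(10) and the inequalities (15)-(16) can be checked numerically in a simple one-dimensional case. The following sketch is only an illustration: the density $\rho(I) \propto \exp(-I^4/4)$, the integration window and the helper name `expectation` are assumptions chosen here, not taken from the text.

```python
import math

def expectation(f, log_rho, lo=-6.0, hi=6.0, n=20001):
    """Trapezoidal estimate of <f> for the density proportional to exp(log_rho)."""
    h = (hi - lo) / (n - 1)
    num = den = 0.0
    for k in range(n):
        x = lo + k * h
        w = (0.5 if k in (0, n - 1) else 1.0) * h * math.exp(log_rho(x))
        num += w * f(x)
        den += w
    return num / den

# Assumed illustrative density: rho(I) proportional to exp(-I^4 / 4) on the real line.
log_rho = lambda x: -x**4 / 4.0
eta = lambda x: x**3          # differential force, eta = -d(log rho)/dI, Eq. (5)
chi = lambda x: 3.0 * x**2    # response, chi = d(eta)/dI, Eq. (7)

mean_I = expectation(lambda x: x, log_rho)
mean_eta = expectation(eta, log_rho)                            # Eq. (8): expected 0
eta_dI = expectation(lambda x: eta(x) * (x - mean_I), log_rho)  # Eq. (9): expected 1
mean_chi = expectation(chi, log_rho)                            # left side of Eq. (10)
eta_sq = expectation(lambda x: eta(x) ** 2, log_rho)            # right side of Eq. (10)
var_I = expectation(lambda x: (x - mean_I) ** 2, log_rho)

uncertainty = math.sqrt(var_I * eta_sq)   # Eq. (15): Delta I * Delta eta >= 1
bound_gap = var_I - 1.0 / eta_sq          # Eq. (16) in one dimension: C - 1/M >= 0
```

For this non-Gaussian density the product `uncertainty` is strictly greater than one; it would equal one only for a Gaussian distribution, where $\eta$ is linear in $\delta I$.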

2.2. Inference theory 

Inference theory can be described as the problem of deciding how well a set of outcomes $\mathcal{I} = \{I^{(1)}, I^{(2)}, \ldots, I^{(m)}\}$ obtained from independent measurements fits a proposed distribution function $dp(I|\theta)$ [2]. This question is fully equivalent to inferring the values of the control parameters $\theta$ from this experimental information. To make inferences about the control parameters, one employs estimators $\hat\theta^\alpha = \theta^\alpha(\mathcal{I})$, that is, functions of the outcomes $\mathcal{I} \in \mathcal{M}_\theta^m$, where $\mathcal{M}_\theta^m = \mathcal{M}_\theta \otimes \mathcal{M}_\theta \ldots \otimes \mathcal{M}_\theta$ ($m$ times the external product of the statistical manifold $\mathcal{M}_\theta$). The values of these functions are intended to be the best guess for $\theta^\alpha$.

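Let us admit that the probability density $\rho(I|\theta)$ is everywhere differentiable and finite on the statistical manifold $\mathcal{P}$ of control parameters $\theta$. Let us start by introducing the statistical expectation values $\langle A(\mathcal{I}|\theta) \rangle$ as follows:

$$ \langle A(\mathcal{I}|\theta) \rangle = \int_{\mathcal{M}_\theta^m} A(\mathcal{I}|\theta)\varrho(\mathcal{I}|\theta)\, d\mathcal{I}, \tag{17} $$

where $d\mathcal{I} = dI^{(1)} dI^{(2)} \ldots dI^{(m)}$ and $\varrho(\mathcal{I}|\theta)$ is the so-called likelihood function:

$$ \varrho(\mathcal{I}|\theta) = \prod_{i=1}^{m} \rho(I^{(i)}|\theta). \tag{18} $$

Taking the partial derivative $\partial_\alpha = \partial/\partial\theta^\alpha$ of Eq. (17), one obtains the following mathematical identity:

$$ \langle \partial_\alpha A(\mathcal{I}|\theta) \rangle - \partial_\alpha \langle A(\mathcal{I}|\theta) \rangle = \langle A(\mathcal{I}|\theta)\upsilon_\alpha(\mathcal{I}|\theta) \rangle, \tag{19} $$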
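where:

$$ \upsilon_\alpha = \upsilon_\alpha(\mathcal{I}|\theta) = -\frac{\partial}{\partial\theta^\alpha}\log\varrho(\mathcal{I}|\theta) \tag{20} $$

are the components of the score vector $\upsilon = \{\upsilon_\alpha\}$. Substituting $A(\mathcal{I}|\theta) = 1$ into Eq. (19), one arrives at the vanishing of the expectation values of the score vector components:

$$ \langle \upsilon_\alpha(\mathcal{I}|\theta) \rangle = 0. \tag{21} $$

Let us consider now any unbiased estimator $A(\mathcal{I}|\theta) = \hat\theta^\alpha(\mathcal{I})$ of the parameter $\theta^\alpha$, $\langle \hat\theta^\alpha(\mathcal{I}) \rangle = \theta^\alpha$, as well as the score vector component $A(\mathcal{I}|\theta) = \upsilon_\alpha(\mathcal{I}|\theta)$. Substituting these quantities into identity (19), it is possible to obtain the following results:

$$ \langle \delta\hat\theta^\alpha(\mathcal{I})\, \upsilon_\beta(\mathcal{I}|\theta) \rangle = -\delta^\alpha_\beta, \tag{22} $$

$$ \langle \partial_\beta \upsilon_\alpha(\mathcal{I}|\theta) \rangle = \langle \upsilon_\alpha(\mathcal{I}|\theta)\upsilon_\beta(\mathcal{I}|\theta) \rangle. \tag{23} $$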

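It is easy to realize that the identities (21) and (23) can be regarded as the stationary and stability conditions of the known method of maximum likelihood estimators [2]. According to this method, the best values of the parameters $\theta$ should maximize the logarithm of the likelihood function $\varrho(\mathcal{I}|\theta)$ for a given set of outcomes $\mathcal{I}$. This requirement leads to the following stationary and stability conditions:

$$ -\frac{\partial}{\partial\theta^\alpha}\log\varrho(\mathcal{I}|\bar\theta) = 0, \qquad -\frac{\partial^2}{\partial\theta^\alpha \partial\theta^\beta}\log\varrho(\mathcal{I}|\bar\theta) \succ 0, \tag{24} $$

which should be solved to obtain the maximum likelihood estimators $\hat\theta^\alpha = \bar\theta^\alpha(\mathcal{I})$. On the other hand, the identity (22) also suggests the statistical complementarity between the estimator $\hat\theta^\alpha = \theta^\alpha(\mathcal{I})$ and its conjugated score vector component $\upsilon_\alpha$. Using the Schwarz inequality, one obtains the following uncertainty-like inequality:

$$ \Delta\hat\theta^\alpha \Delta\upsilon_\alpha \ge 1. \tag{25} $$

This result can be easily improved by introducing the inverse matrix $g^{\alpha\beta}(\theta)$ of the self-correlation matrix $g_{\alpha\beta}(\theta) = \langle \upsilon_\alpha(\mathcal{I}|\theta)\upsilon_\beta(\mathcal{I}|\theta) \rangle$ and the auxiliary quantity $X^\alpha = \delta\hat\theta^\alpha + g^{\alpha\beta}\upsilon_\beta$, whose sign follows from identity (22). Thus, one can compose the positive definite form:

$$ \langle (\lambda_\alpha X^\alpha)^2 \rangle = \langle X^\alpha X^\beta \rangle \lambda_\alpha \lambda_\beta \ge 0, \tag{26} $$

which leads to the positive definiteness of the matrix:

$$ \langle \delta\hat\theta^\alpha \delta\hat\theta^\beta \rangle - g^{\alpha\beta}(\theta) \succeq 0. \tag{27} $$

This last inequality is the famous Cramér-Rao theorem of inference theory [2, 3], which imposes a lower bound on the efficiency of unbiased estimators $\hat\theta^\alpha$, where the self-correlation matrix $g_{\alpha\beta}(\theta)$:

$$ g_{\alpha\beta}(\theta) = \langle \upsilon_\alpha(\mathcal{I}|\theta)\upsilon_\beta(\mathcal{I}|\theta) \rangle \tag{28} $$

is the Fisher's information matrix referred to in the introductory section.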
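
As an illustration of identities (22), (27) and (28), the following Monte Carlo sketch considers $m$ independent Gaussian outcomes with unknown mean $\theta$ and known width $\sigma$, and takes the sample mean as the unbiased estimator. The Gaussian model, the parameter values and the function name `simulate` are assumptions made only for this example.

```python
import random

def simulate(theta=1.0, sigma=2.0, m=10, trials=20000, seed=0):
    """Monte Carlo averages for the sample-mean estimator of a Gaussian mean."""
    rng = random.Random(seed)
    est_sq = score_sq = cross = 0.0
    for _ in range(trials):
        xs = [rng.gauss(theta, sigma) for _ in range(m)]
        est = sum(xs) / m                                  # unbiased estimator of theta
        # Score component, Eq. (20): v = -d/dtheta log(likelihood)
        score = -sum(x - theta for x in xs) / sigma**2
        est_sq += (est - theta) ** 2
        score_sq += score**2
        cross += (est - theta) * score
    return est_sq / trials, score_sq / trials, cross / trials

# var_est ~ <dtheta^2>, fisher ~ g(theta) of Eq. (28), cross ~ <dtheta v> of Eq. (22)
var_est, fisher, cross = simulate()
cramer_rao_ratio = var_est * fisher   # Eq. (27) requires this to be >= 1 on average
```

With these assumed values the exact results are $g(\theta) = m/\sigma^2 = 2.5$ and $\langle \delta\hat\theta^2 \rangle = \sigma^2/m = 0.4$, so the sample mean saturates the bound (27) and the ratio fluctuates around one within the sampling error.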
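Inference theory                                                                                              Fluctuation theory

$\upsilon(\mathcal{I}|\theta) = -\nabla_\theta \log\varrho(\mathcal{I}|\theta)$                               $\eta(I|\theta) = -\nabla_I \log\rho(I|\theta)$
$\langle \upsilon(\mathcal{I}|\theta) \rangle = 0$                                                            $\langle \eta(I|\theta) \rangle = 0$
$\langle \upsilon(\mathcal{I}|\theta) \cdot \delta\hat\theta \rangle = -1_\theta$                             $\langle \eta(I|\theta) \cdot \delta I \rangle = 1_I$
$\langle \nabla_\theta \cdot \upsilon(\mathcal{I}|\theta) \rangle = \langle \upsilon(\mathcal{I}|\theta) \cdot \upsilon(\mathcal{I}|\theta) \rangle$    $\langle \nabla_I \cdot \eta(I|\theta) \rangle = \langle \eta(I|\theta) \cdot \eta(I|\theta) \rangle$

Table 1. Analogy between inference theory and fluctuation theory.
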



2.3. Analogy between inference theory and fluctuation theory 

Fluctuation theory and inference theory provide two different but complementary characterizations of a given parametric family of distribution functions $dp(I|\theta)$. Formally, these two statistical frameworks can be regarded as dual counterpart approaches because of the close analogy between their main definitions and theorems, as clearly evidenced in Table 1. To simplify the notation, the following symbols were introduced there: the gradient operators $\partial_i \to \nabla_I$ and $\partial_\alpha \to \nabla_\theta$, the dyadic products $A \cdot B = A_i B_j\, e^i \cdot e^j$ and $\xi \cdot \psi = \xi_\alpha \psi_\beta\, \epsilon^\alpha \cdot \epsilon^\beta$, and the Kronecker deltas $\delta^i_j \to 1_I$ and $\delta^\alpha_\beta \to 1_\theta$.

Remarkably, the analogy between fluctuation theory and inference theory is incomplete in regard to their respective geometric features. The parametric family $dp(I|\theta)$ is expressed in the representations $\mathcal{R}_I$ and $\mathcal{R}_\theta$ of the statistical manifolds $\mathcal{M}_\theta$ and $\mathcal{P}$, respectively. Equivalently, the same parametric family can also be rewritten using the representations $\mathcal{R}_\Theta$ and $\mathcal{R}_\nu$ of the manifolds $\mathcal{M}_\theta$ and $\mathcal{P}$, which implies the consideration of the coordinate changes $\Theta(I): \mathcal{R}_I \to \mathcal{R}_\Theta$ and $\nu(\theta): \mathcal{R}_\theta \to \mathcal{R}_\nu$. Under these parametric changes, the Fisher's inference matrix (28) behaves as the components of a second rank covariant tensor:

$$ g_{\gamma\delta}(\nu) = \frac{\partial\theta^\alpha}{\partial\nu^\gamma}\frac{\partial\theta^\beta}{\partial\nu^\delta}\, g_{\alpha\beta}(\theta). \tag{29} $$

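The existence of these last transformation rules guarantees the invariance of the inference distance notion (1). Thus, the statistical manifold $\mathcal{P}$ of control parameters $\theta$ can be endowed with a Riemannian structure. The relevance of the distance notion (1) can be understood by considering the asymptotic expression of the distribution function of the efficient unbiased estimators $\hat\theta_{eff}(\mathcal{I})$:

$$ dQ^m(\vartheta|\theta) = d\vartheta \int_{\mathcal{M}_\theta^m} \delta\left[\vartheta - \hat\theta_{eff}(\mathcal{I})\right] \varrho(\mathcal{I}|\theta)\, d\mathcal{I} \tag{30} $$

when the number of outcomes $m$ is sufficiently large. Since $g_{\alpha\beta}(\theta) \propto m$, one can obtain the following approximation formula:

$$ dQ^m(\vartheta|\theta) \simeq \exp\left[-\frac{1}{2} g_{\alpha\beta}(\theta)\Delta\vartheta^\alpha \Delta\vartheta^\beta\right] \sqrt{\left|\frac{g_{\alpha\beta}(\theta)}{2\pi}\right|}\, d\vartheta, \tag{31} $$

where $\Delta\vartheta^\alpha = \vartheta^\alpha - \theta^\alpha$. Accordingly, the distance notion (1) provides the distinguishing probability between two close distribution functions of the parametric family $dp(I|\theta)$ through the inferential procedure.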

The analogy between fluctuation theory and inference theory strongly suggests the existence of a counterpart approach of inference geometry in the framework of fluctuation theory, that is, the existence of a Riemannian distance notion (3) to characterize the statistical separation between close points $I$ and $I + dI \in \mathcal{M}_\theta$. Unfortunately, the underlying analogy is insufficient to introduce the particular expression of the metric tensor $g_{ij}(I|\theta)$. For example, it is easy to check that the fluctuation theorems (8)-(10) can also be expressed in the new coordinate representation $\mathcal{R}_\Theta$ [18], for example, the associated fluctuation theorem:

$$ \langle \partial_i \eta_j(\Theta|\theta) \rangle = \langle \eta_i(\Theta|\theta)\eta_j(\Theta|\theta) \rangle, \tag{32} $$

where $\partial_i = \partial/\partial\Theta^i$ and $\eta_i(\Theta|\theta)$:

$$ \eta_i(\Theta|\theta) = -\frac{\partial}{\partial\Theta^i}\log\rho(\Theta|\theta). \tag{33} $$

However, the self-correlation matrix $\langle \eta_i(\Theta|\theta)\eta_j(\Theta|\theta) \rangle$ associated with the new coordinate representation $\mathcal{R}_\Theta$ is not related by local transformation rules to its counterpart expression $M_{ij}(\theta) = \langle \eta_i(I|\theta)\eta_j(I|\theta) \rangle$ in the old coordinate representation $\mathcal{R}_I$. The self-correlation matrix of the differential forces $M_{ij}(\theta)$ is a matrix function defined on the statistical manifold $\mathcal{P}$ of control parameters $\theta$, while the metric tensor $g_{ij}(I|\theta)$ is a tensorial entity defined on the statistical manifolds $\mathcal{M}_\theta$ and $\mathcal{P}$. As expected, the definition of the metric tensor $g_{ij}(I|\theta)$ cannot involve integral expressions over the manifold $\mathcal{M}_\theta$, as is the case for the inference metric tensor (28).

3. Fluctuation geometry 

Fluctuation geometry can be formulated starting from a set of axioms that combine the statistical nature of the manifold $\mathcal{M}_\theta$ and the notions of differential geometry. These axioms specify the way to introduce the metric tensor $g_{ij}(I|\theta)$ associated with the parametric family (2). This section is devoted to these axioms and their most direct consequences.

3.1. Postulates 

Axiom 1 The manifold of the stochastic variables $\mathcal{M}_\theta$ possesses a Riemannian structure, that is, it is provided with a metric tensor $g_{ij}(I|\theta)$ and a torsionless covariant differentiation $D_i$ that obeys the following constraints:

$$ D_k g_{ij}(I|\theta) = 0. \tag{34} $$

Definition 1 The Riemannian structure on the statistical manifold $\mathcal{M}_\theta$ allows one to introduce the invariant volume element as follows:

$$ d\mu(I|\theta) = \sqrt{\left|\frac{g_{ij}(I|\theta)}{2\pi}\right|}\, dI, \tag{35} $$

where $|g_{ij}(I|\theta)|$ denotes the absolute value of the metric tensor determinant.

Axiom 2 There exists a differentiable scalar function $S(I|\theta)$ defined on the statistical manifold $\mathcal{M}_\theta$, hereafter referred to as the information potential, whose knowledge determines the distribution function $dp(I|\theta)$ of the stochastic variables $I \in \mathcal{M}_\theta$ as follows:

$$ dp(I|\theta) = \exp\left[S(I|\theta)\right] d\mu(I|\theta). \tag{36} $$

Definition 2 Let us consider an arbitrary curve given in parametric form $I(t) \in \mathcal{M}_\theta$ with fixed extreme points $I(t_1) = P$ and $I(t_2) = Q$. Adopting the following notation:

$$ \dot I^i(t) = \frac{dI^i(t)}{dt}, \tag{37} $$

the length $\Delta s$ of this curve can be expressed as:

$$ \Delta s = \int_{t_1}^{t_2} \sqrt{g_{ij}[I(t)|\theta]\, \dot I^i(t) \dot I^j(t)}\, dt. \tag{38} $$
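
The length of a curve in Definition 2 can be approximated numerically by summing $\sqrt{g_{ij}\, dI^i dI^j}$ over small chords. The sketch below is a minimal illustration: the constant diagonal metric, the straight-line curve and the names `curve_length`, `metric`, `line` and `reparam` are assumptions introduced here. It also shows that the length does not depend on the parametrization of the curve.

```python
import math

def curve_length(metric, curve, t1, t2, steps=2000):
    """Approximate the curve length by summing sqrt(g_ij dI^i dI^j) over chords."""
    total = 0.0
    prev = curve(t1)
    for k in range(1, steps + 1):
        cur = curve(t1 + (t2 - t1) * k / steps)
        mid = [(a + b) / 2.0 for a, b in zip(prev, cur)]   # metric evaluated mid-chord
        d = [b - a for a, b in zip(prev, cur)]
        g = metric(mid)
        n = len(d)
        ds2 = sum(g[i][j] * d[i] * d[j] for i in range(n) for j in range(n))
        total += math.sqrt(ds2)
        prev = cur
    return total

def metric(point):
    # Assumed example: constant metric diag(1/sigma_1^2, 1/sigma_2^2), sigmas (1, 2).
    return [[1.0, 0.0], [0.0, 0.25]]

line = lambda t: [3.0 * t, 8.0 * t]     # straight line from (0, 0) to (3, 8)
reparam = lambda t: line(t * t)         # same path, different parametrization

length = curve_length(metric, line, 0.0, 1.0)
length_reparam = curve_length(metric, reparam, 0.0, 1.0)
```

For this flat metric the exact length is $\sqrt{3^2 + (8/2)^2} = 5$, and both parametrizations reproduce it.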
Definition 3 It is said that the curve $I(t) \in \mathcal{M}_\theta$ exhibits a unitary affine parametrization when its parameter $t$ satisfies the following constraint:

$$ g_{ij}[I(t)|\theta]\, \dot I^i(t) \dot I^j(t) = 1. \tag{39} $$

Definition 4 A geodesic is the curve $I_g(t)$ with minimal length (40) between two fixed arbitrary points $(P, Q) \in \mathcal{M}_\theta$. Moreover, the distance $D_\theta(P, Q)$ between these two points $(P, Q)$ is given by the length of its associated geodesic $I_g(t)$:

$$ D_\theta(P, Q) = \int_{t_1}^{t_2} \sqrt{g_{ij}[I_g(t)|\theta]\, \dot I_g^i(t) \dot I_g^j(t)}\, dt. \tag{40} $$

Definition 5 Let us consider a differentiable curve $I(t) \in \mathcal{M}_\theta$ with a unitary affine parametrization. The information dissipation $\Phi(t)$ along the curve $I(t)$ is defined as follows:

$$ \Phi(t) = \frac{dS[I(t)|\theta]}{dt}. \tag{41} $$

Axiom 3 The length $\Delta s$ of any interval $(t_1, t_2)$ of an arbitrary geodesic $I_g(t) \in \mathcal{M}_\theta$ with a unitary affine parametrization is given by the negative of the variation of its information dissipation $\Delta\Phi(t)$:

$$ \Delta s = -\Delta\Phi(t) = \Phi(t_1) - \Phi(t_2). \tag{42} $$

Axiom 4 If $\mathcal{M}_\theta$ is not a closed manifold, $\partial\mathcal{M}_\theta \neq \emptyset$, the probability density $\rho(I|\theta)$ associated with the distribution function (36) vanishes together with its first partial derivatives at any point on the boundary $\partial\mathcal{M}_\theta$ of the statistical manifold $\mathcal{M}_\theta$.

3.2. Analysis of axioms and their direct consequences 

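Axiom 1 postulates the existence of the metric tensor $g_{ij}(I|\theta)$ defined on the statistical manifold $\mathcal{M}_\theta$. Moreover, this axiom specifies the Riemannian structure of the manifold $\mathcal{M}_\theta$ starting from the knowledge of the metric tensor $g_{ij}(I|\theta)$, e.g.: the covariant differentiation $D_i$ and the curvature tensor $R_{ijkl}(I|\theta)$. Equation (34) is a strong constraint of Riemannian geometry that determines natural affine connections $\Gamma^k_{ij}$ for the covariant differentiation $D_i$, specifically, the so-called Levi-Civita connection [19]:

$$ \Gamma^k_{ij}(I|\theta) = \frac{1}{2} g^{km}\left(\frac{\partial g_{im}}{\partial I^j} + \frac{\partial g_{jm}}{\partial I^i} - \frac{\partial g_{ij}}{\partial I^m}\right). \tag{43} $$
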
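The knowledge of the affine connections $\Gamma^k_{ij}$ allows the introduction of the curvature tensor $R^l_{ijk} = R^l_{ijk}(I|\theta)$ of the manifold $\mathcal{M}_\theta$:

$$ R^l_{ijk} = \frac{\partial}{\partial I^j}\Gamma^l_{ki} - \frac{\partial}{\partial I^k}\Gamma^l_{ji} + \Gamma^l_{jm}\Gamma^m_{ki} - \Gamma^l_{km}\Gamma^m_{ji}, \tag{44} $$

which is also derived from the knowledge of the metric tensor $g_{ij}(I|\theta)$ and its first and second partial derivatives.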
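
The Levi-Civita connection can be evaluated numerically from any metric tensor by finite differences of its components. The following sketch is restricted to two dimensions and uses the flat plane in polar coordinates as a test metric; this test case and the helper names `inverse_2x2`, `christoffel` and `polar_metric` are assumptions chosen for illustration, since the exact symbols are well known in that case.

```python
def inverse_2x2(g):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def christoffel(metric, point, h=1e-5):
    """Levi-Civita symbols gamma[k][i][j] = Gamma^k_ij from central differences."""
    n = 2
    d = []                                   # d[m][i][j] = d g_ij / d I^m
    for m in range(n):
        up, dn = list(point), list(point)
        up[m] += h
        dn[m] -= h
        gu, gd = metric(up), metric(dn)
        d.append([[(gu[i][j] - gd[i][j]) / (2.0 * h) for j in range(n)]
                  for i in range(n)])
    g_inv = inverse_2x2(metric(point))
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Gamma^k_ij = (1/2) g^{km} (d_j g_im + d_i g_jm - d_m g_ij)
                gamma[k][i][j] = 0.5 * sum(
                    g_inv[k][m] * (d[j][i][m] + d[i][j][m] - d[m][i][j])
                    for m in range(n))
    return gamma

def polar_metric(point):
    # Assumed test metric: flat plane in polar coordinates (r, phi), g = diag(1, r^2).
    r, _phi = point
    return [[1.0, 0.0], [0.0, r * r]]

gamma = christoffel(polar_metric, [2.0, 0.3])
```

At $r = 2$ the known non-vanishing symbols are $\Gamma^r_{\phi\phi} = -r = -2$ and $\Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = 1/r = 0.5$, which the sketch reproduces up to the finite-difference error.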

Axiom 2 postulates the probabilistic nature of the manifold $\mathcal{M}_\theta$, in particular, the existence of the distribution function $dp(I|\theta)$ and the information potential $S(I|\theta)$. Equivalently, this axiom provides a formal definition for the information potential $S(I|\theta)$ when one starts from the knowledge of the parametric family (2). The probability density $\rho(I|\theta)$ of the parametric family (2) obeys the transformation rule of a tensorial density:

$$ \rho(\Theta|\theta) = \rho(I|\theta)\left|\frac{\partial\Theta}{\partial I}\right|^{-1} \tag{45} $$

under a coordinate change $\Theta(I): \mathcal{R}_I \to \mathcal{R}_\Theta$ of the statistical manifold $\mathcal{M}_\theta$. The covariance of the metric tensor $g_{ij}(I|\theta)$:

$$ g_{ij}(\Theta|\theta) = \frac{\partial I^m}{\partial\Theta^i}\frac{\partial I^n}{\partial\Theta^j}\, g_{mn}(I|\theta) \tag{46} $$

implies that the pre-factor of the invariant volume element (35) also behaves as a tensorial density:

$$ \sqrt{\left|\frac{g_{ij}(\Theta|\theta)}{2\pi}\right|} = \sqrt{\left|\frac{g_{ij}(I|\theta)}{2\pi}\right|}\left|\frac{\partial\Theta}{\partial I}\right|^{-1}. \tag{47} $$

Admitting that the metric tensor determinant $|g_{ij}(I|\theta)|$ is non-vanishing everywhere, it is possible to introduce the probability weight $\omega(I|\theta)$:

$$ \omega(I|\theta) = \rho(I|\theta)\sqrt{\left|2\pi g^{ij}(I|\theta)\right|}, \tag{48} $$

which represents a scalar function defined on the manifold $\mathcal{M}_\theta$. Since the statistical manifold $\mathcal{M}_\theta$ possesses a Riemannian structure, the integration over the usual volume element $dI$ can be replaced by the invariant volume element $d\mu(I|\theta)$. This consideration allows one to rephrase the parametric family (2) in the following equivalent representation:

$$ dp(I|\theta) = \omega(I|\theta)\, d\mu(I|\theta), \tag{49} $$

which explicitly exhibits the covariance of the distribution function under the coordinate reparametrizations of the manifold $\mathcal{M}_\theta$. The information potential $S(I|\theta)$ is defined by the logarithm of the probability weight $\omega(I|\theta)$:

$$ S(I|\theta) \equiv \log\omega(I|\theta), \tag{50} $$

which also represents a scalar function defined on the statistical manifold $\mathcal{M}_\theta$. As discussed in section 6, the negative of the information potential $S(I|\theta)$ can be regarded as a local invariant measure of the information content in the framework of information theory. Additionally, $S(I|\theta)$ can be identified with the scalar entropy of a closed system in the framework of classical fluctuation theory [18]. Given the probability density $\rho(I|\theta)$, the information potential $S(I|\theta)$ depends on the metric tensor $g_{ij}(I|\theta)$ of the statistical manifold $\mathcal{M}_\theta$. Axiom 1 postulates the existence of this tensor, but its specific definition is still arbitrary. Axiom 3 eliminates this ambiguity. In fact, it establishes a direct connection between the distance notion (3) and the information dissipation (41), or equivalently, between the metric tensor $g_{ij}(I|\theta)$ and the information potential $S(I|\theta)$.
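
The scalar character of the probability weight can be illustrated in one dimension: under a coordinate change the probability density and the metric component both change, but their combination $\omega = \rho\sqrt{2\pi/g}$ does not. In the sketch below, the Gaussian density, the metric component $g = 1/\sigma^2$, the coordinate change $\Theta = I^3 + I$ and all function names are assumptions chosen for illustration.

```python
import math

sigma = 1.5

def rho_I(I):
    """Assumed Gaussian density in the chart R_I."""
    return math.exp(-I**2 / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))

def g_I(I):
    """Assumed metric component in the chart R_I."""
    return 1.0 / sigma**2

def jac(I):
    """Jacobian dTheta/dI of the assumed coordinate change Theta = I^3 + I."""
    return 3.0 * I**2 + 1.0

def in_theta_chart(I):
    rho_T = rho_I(I) / abs(jac(I))    # density transforms as a tensorial density
    g_T = g_I(I) / jac(I) ** 2        # metric transforms as a covariant tensor
    return rho_T, g_T

def weight(rho, g):
    """Probability weight in one dimension: omega = rho * sqrt(2 pi / g)."""
    return rho * math.sqrt(2.0 * math.pi / g)

I0 = 0.7
rho_T, g_T = in_theta_chart(I0)
omega_I = weight(rho_I(I0), g_I(I0))
omega_T = weight(rho_T, g_T)
```

The two densities `rho_I(I0)` and `rho_T` differ by the Jacobian factor, while `omega_I` and `omega_T` coincide, as expected for a scalar function on the manifold.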

Theorem 1 The metric tensor $g_{ij}(I|\theta)$ can be identified with the negative of the covariant Hessian $H_{ij}(I|\theta)$ of the information potential $S(I|\theta)$:

$$ g_{ij}(I|\theta) = -H_{ij}(I|\theta) = -D_i D_j S(I|\theta). \tag{51} $$

Corollary 1 The information potential $S(I|\theta)$ is locally concave everywhere and the metric tensor $g_{ij}(I|\theta)$ is positive definite on the statistical manifold $\mathcal{M}_\theta$.

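Proof. The search for the curve with minimal length (40) between two arbitrary points $(P, Q)$ is a variational problem that leads to the so-called geodesic differential equations [19]:

$$ \dot I_g^k(t) D_k \dot I_g^i(t) = \ddot I_g^i(t) + \Gamma^i_{mn}[I_g(t)|\theta]\, \dot I_g^m(t) \dot I_g^n(t) = 0, \tag{52} $$

which describe the geodesics $I_g(t)$ with a unitary affine parametrization. Equations (41) and (42) can be rephrased as follows:

$$ \Phi(s) = \dot I_g^i(s)\frac{\partial S[I_g(s)|\theta]}{\partial I^i}, \qquad \frac{d\Phi(s)}{ds} = -1. \tag{53} $$

Considering the geodesic differential equations (52), the constraint (53) is rewritten as:

$$ \frac{d\Phi(s)}{ds} = \ddot I_g^k \frac{\partial S}{\partial I^k} + \dot I_g^i \dot I_g^j \frac{\partial^2 S}{\partial I^i \partial I^j} = \dot I_g^i \dot I_g^j \left(\frac{\partial^2 S}{\partial I^i \partial I^j} - \Gamma^k_{ij}\frac{\partial S}{\partial I^k}\right) = -1, \tag{54} $$

which involves the covariant Hessian $H_{ij}$:

$$ H_{ij} = D_i D_j S = \frac{\partial^2 S}{\partial I^i \partial I^j} - \Gamma^k_{ij}\frac{\partial S}{\partial I^k}. \tag{55} $$

Combining equations (54)-(55) and the constraint (39), one obtains the following expression:

$$ (g_{ij} + H_{ij})\, \dot I_g^i \dot I_g^j = 0. \tag{56} $$

Its covariant character leads to Eq. (51). Corollary 1, that is, the concave character of the information potential $S(I|\theta)$ and the positive definiteness of the metric tensor $g_{ij}(I|\theta)$, is a direct consequence of equation (53). ■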

Corollary 2 The metric tensor $g_{ij} = g_{ij}(I|\theta)$ can be obtained from the probability density $\rho = \rho(I|\theta)$ through the following set of covariant second-order partial differential equations:

$$ g_{ij} = -\frac{\partial^2 \log\rho}{\partial I^i \partial I^j} + \Gamma^k_{ij}\frac{\partial\log\rho}{\partial I^k} + \frac{\partial\Gamma^k_{jk}}{\partial I^i} - \Gamma^k_{ij}\Gamma^l_{kl}. \tag{57} $$

The admissible solutions for the metric tensor $g_{ij}$ should be finite and differentiable everywhere, including the boundary $\partial\mathcal{M}_\theta$ of the statistical manifold $\mathcal{M}_\theta$.

Proof. Expression (57) is derived from equation (51) by rewriting the information potential as $S(I|\theta) = \log\rho(I|\theta) - \log\sqrt{|g_{ij}(I|\theta)/2\pi|}$ and considering the following identity:

$$ \Gamma^j_{ij}(I|\theta) = \frac{\partial}{\partial I^i}\log\sqrt{\left|g_{ij}(I|\theta)/2\pi\right|}, \tag{58} $$

which is a known property of the Levi-Civita connection (43). ■
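
In one dimension, the set of equations of Corollary 2 reduces to $g = -(\log\rho)'' + \Gamma(\log\rho)' + \Gamma' - \Gamma^2$ with $\Gamma = (\log\sqrt{g})'$. The sketch below checks this reduced equation by finite differences for an assumed example: a Gaussian variable $I$ with metric $1/\sigma^2$, rewritten in the chart $T = \sinh I$, where neither the density nor the metric is trivial. The example and the names `log_rho`, `g`, `d1`, `d2` and `residual` are illustrative assumptions.

```python
import math

sigma = 1.5

def log_rho(T):
    """Log-density of a Gaussian in I = asinh(T), expressed in the T chart."""
    I = math.asinh(T)
    return (-I**2 / (2.0 * sigma**2) - 0.5 * math.log(1.0 + T**2)
            - math.log(sigma * math.sqrt(2.0 * math.pi)))

def g(T):
    """Candidate metric: 1/sigma^2 transformed to the T chart."""
    return 1.0 / (sigma**2 * (1.0 + T**2))

def d1(f, x, h=1e-3):
    return (f(x + h) - f(x - h)) / (2.0 * h)            # central first derivative

def d2(f, x, h=1e-3):
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2    # central second derivative

def residual(T):
    """g minus the right-hand side of the one-dimensional reduction."""
    half_log_g = lambda x: 0.5 * math.log(g(x))
    gamma = d1(half_log_g, T)       # Gamma = d/dT log sqrt(g)
    dgamma = d2(half_log_g, T)      # dGamma/dT
    rhs = -d2(log_rho, T) + gamma * d1(log_rho, T) + dgamma - gamma**2
    return g(T) - rhs

max_residual = max(abs(residual(T)) for T in (-2.0, 0.0, 0.5, 3.0))
```

The residual vanishes up to the finite-difference error at every test point, which illustrates the covariance of the equations: a solution in one chart remains a solution after a coordinate change.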

Axiom 4 concerns the asymptotic behavior of the distribution function (36) for any point $I_b$ on the boundary $\partial\mathcal{M}_\theta$:

$$ \lim_{I \to I_b} \rho(I|\theta) = \lim_{I \to I_b} \frac{\partial}{\partial I^i}\rho(I|\theta) = 0. \tag{59} $$

These last conditions are necessary to obtain the general fluctuation theorems (8)-(10) reviewed in the previous section. Moreover, this axiom will be employed to analyze the character of the stationary points (maxima and minima) of the information potential $S(I|\theta)$.



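Remark 1 The boundary conditions (59) are independent of the admissible coordinate representation $\mathcal{R}_I$ of the statistical manifold $\mathcal{M}_\theta$. Moreover, the probability weight $\omega(I|\theta)$ vanishes on the boundary $\partial\mathcal{M}_\theta$ of the manifold $\mathcal{M}_\theta$.

Proof. This remark is a direct consequence of the transformation rule of the probability density (45) as well as the one associated with its partial derivatives:

$$ \frac{\partial\rho(\Theta|\theta)}{\partial\Theta^i} = \frac{\partial I^j}{\partial\Theta^i}\left(\frac{\partial\rho(I|\theta)}{\partial I^j} - \rho(I|\theta)\frac{\partial}{\partial I^j}\log\left|\frac{\partial\Theta}{\partial I}\right|\right)\left|\frac{\partial\Theta}{\partial I}\right|^{-1} \tag{60} $$

under a coordinate change $\Theta(I): \mathcal{R}_I \to \mathcal{R}_\Theta$ with Jacobian $|\partial\Theta/\partial I|$ finite and differentiable everywhere. Since the metric tensor determinant $|g_{ij}(I|\theta)|$ is non-vanishing everywhere, Axiom 4 directly implies the vanishing of the probability weight $\omega(I|\theta)$ on the boundary $\partial\mathcal{M}_\theta$ of the statistical manifold $\mathcal{M}_\theta$. ■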
4. Geometric reinterpretation of the statistical description

The question of the existence and uniqueness of the solutions of problem (57) cannot be fully analyzed in this work because of its complexity. This section is devoted to some consequences derived from the existence of a given particular solution $g_{ij}(I|\theta)$.

4.1. Gaussian representation

Definition 6 The covariant components of the gradiental vector $\psi_i(I|\theta)$ are defined from the information potential $S(I|\theta)$ as follows:

$$ \psi_i(I|\theta) = -D_i S(I|\theta) \equiv -\partial S(I|\theta)/\partial I^i. \tag{61} $$

Using the metric tensor $g^{ij}(I|\theta)$, it is possible to obtain its contravariant counterpart $\psi^i(I|\theta)$:

$$ \psi^i(I|\theta) = g^{ij}(I|\theta)\psi_j(I|\theta), \tag{62} $$

as well as its square norm $\psi^2 = \psi^2(I|\theta)$:

$$ \psi^2(I|\theta) = \psi^i(I|\theta)\psi_i(I|\theta). \tag{63} $$

Theorem 2 The information potential $S(I|\theta)$ can be expressed in terms of the square norm of the gradiental vector as follows:

$$ S(I|\theta) = \mathcal{P}(\theta) - \frac{1}{2}\psi^2(I|\theta), \tag{64} $$

where $\mathcal{P}(\theta)$ is a certain function of the control parameters $\theta$, which is hereafter referred to as the gaussian potential.

Proof. Let us introduce the scalar function $\mathcal{P}(I|\theta)$:

$$ \mathcal{P}(I|\theta) = S(I|\theta) + \frac{1}{2} g^{ij}(I|\theta)\psi_i(I|\theta)\psi_j(I|\theta). \tag{65} $$

It is easy to verify that its covariant derivatives:

$$ D_k \mathcal{P}(I|\theta) = D_k S(I|\theta) + \frac{1}{2}\left(\psi_i(I|\theta)\psi_j(I|\theta) D_k g^{ij}(I|\theta) + g^{ij}(I|\theta)\left[\psi_i(I|\theta) D_k \psi_j(I|\theta) + \psi_j(I|\theta) D_k \psi_i(I|\theta)\right]\right) \tag{66} $$

vanish as direct consequences of the metric tensor properties (34) and (51), as well as definition (61) of the gradiental vector $\psi_i(I|\theta)$. Since the covariant derivatives of any scalar function are given by the usual partial derivatives:

$$ D_k \mathcal{P}(I|\theta) = \frac{\partial}{\partial I^k}\mathcal{P}(I|\theta) = 0, \tag{67} $$

the scalar function $\mathcal{P}(I|\theta)$ can only depend on the control parameters $\theta$:

$$ \mathcal{P}(I|\theta) \equiv \mathcal{P}(\theta). \tag{68} $$

Mathematically speaking, the scalar function (65) can be regarded as a first integral of the set of covariant differential equations (57). ■

Corollary 3 The value of the information potential $S(I|\theta)$ at all its extreme points derived from the stationary condition:

$$ \psi^i(\bar I|\theta) = 0 \tag{69} $$

is exactly given by the gaussian potential $\mathcal{P}(\theta)$.

Corollary 4 The distribution function (36) admits the following gaussian representation:

$$ dp(I|\theta) = \frac{1}{Z(\theta)}\exp\left[-\frac{1}{2}\psi^2(I|\theta)\right] d\mu(I|\theta). \tag{70} $$

Here, the factor $Z(\theta)$ is related to the gaussian potential $\mathcal{P}(\theta)$ as follows:

$$ \mathcal{P}(\theta) = -\log Z(\theta), \tag{71} $$

which is hereafter referred to as the gaussian partition function.
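
The gaussian representation can be made concrete with a two-dimensional Gaussian density, taking the inverse covariance matrix as the metric tensor. This choice is an assumption made for this sketch: it is a constant metric, so the Levi-Civita connection vanishes and the metric equals minus the ordinary Hessian of $S$. All names (`cov`, `mu`, `log_rho`, `S`, `psi_squared`) are illustrative.

```python
import math

cov = [[2.0, 0.6], [0.6, 1.0]]     # assumed covariance matrix
mu = [1.0, -0.5]                   # assumed mean (point of maximum S)
det_cov = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
# Assumed solution for the metric: g_ij = inverse covariance, so g^{ij} = cov.
g = [[cov[1][1] / det_cov, -cov[0][1] / det_cov],
     [-cov[1][0] / det_cov, cov[0][0] / det_cov]]

def log_rho(I):
    d = [I[0] - mu[0], I[1] - mu[1]]
    q = sum(g[i][j] * d[i] * d[j] for i in range(2) for j in range(2))
    return -0.5 * q - math.log(2.0 * math.pi * math.sqrt(det_cov))

def S(I):
    """S = log rho - log sqrt|g_ij / 2 pi|; in 2D, |g_ij / 2 pi| = det(g) / (2 pi)^2."""
    det_g = 1.0 / det_cov
    return log_rho(I) - 0.5 * math.log(det_g / (2.0 * math.pi) ** 2)

def psi_squared(I, h=1e-5):
    """psi_i = -dS/dI^i by central differences, then psi^2 = g^{ij} psi_i psi_j."""
    psi = []
    for i in range(2):
        up, dn = list(I), list(I)
        up[i] += h
        dn[i] -= h
        psi.append(-(S(up) - S(dn)) / (2.0 * h))
    return sum(cov[i][j] * psi[i] * psi[j] for i in range(2) for j in range(2))

points = [[0.0, 0.0], [1.0, -0.5], [3.0, 2.0], [-2.5, 1.5]]
# Gaussian potential P = S + psi^2 / 2 should not depend on the point I.
P_values = [S(I) + 0.5 * psi_squared(I) for I in points]
Z = math.exp(-P_values[0])         # gaussian partition function
```

In this example the gaussian potential vanishes at every point, so that $Z(\theta) = 1$ and the information potential reduces to minus one half of the squared Mahalanobis distance to the mean.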

4.2. Maximum and completeness theorems

Theorem 3 The information potential $S(I|\theta)$ exhibits a unique stationary point $\bar I$ in the statistical manifold $\mathcal{M}_\theta$, which corresponds to its global maximum.

Proof. The information potential $S(I|\theta)$ should exhibit at least one stationary point $\bar I$ where the stationary condition (69) holds. This conclusion follows from the vanishing of the scalar weight of the distribution function:

$$ \omega(I|\theta) = \exp\left[S(I|\theta)\right] \tag{72} $$

on the boundary $\partial\mathcal{M}_\theta$, as well as its nonnegative, finite and differentiable character on the simply connected manifold $\mathcal{M}_\theta$. Since the information potential $S(I|\theta)$ is a concave function, its stationary points can only correspond to local maxima. Let us suppose the existence of at least two stationary points $\bar I_1$ and $\bar I_2$ as well as the geodesic $I_g(t)$ that connects these points. According to constraint (53), the information dissipation $\Phi(t)$ is a monotonic function along the curve $I_g(t)$. Therefore, $\Phi(t)$ should exhibit different values at the stationary points $\bar I_1$ and $\bar I_2$, which is absurd since the information dissipation $\Phi(t)$ identically vanishes at any stationary point of the information potential $S(I|\theta)$:

$$ \Phi(t) = -\dot I^i(t)\psi_i[I(t)|\theta]. \tag{73} $$

Consequently, there exists only one stationary point, which corresponds to the global maximum of the information potential $S(I|\theta)$. ■

Theorem 4 Any hyper-surface of constant information potential $S(I|\theta)$ is just the boundary of an $n$-dimensional sphere $S^n(\bar I, \ell) \subset \mathcal{M}_\theta$ centered at the point $\bar I$ with global maximum information potential, where $n$ is the dimension of the manifold $\mathcal{M}_\theta$. Moreover, the information potential $S$ depends on the radius $\ell$ of this $n$-dimensional sphere as follows:

$$ S = \mathcal{P}(\theta) - \frac{1}{2}\ell^2. \tag{74} $$

Proof. By definition, the vector field $\upsilon^i(I|\theta)$:

$$ \upsilon^i(I|\theta) = \frac{\psi^i(I|\theta)}{\psi(I|\theta)} \tag{75} $$

is the unitary normal vector of the hyper-surface with constant information potential $S(I|\theta)$. It is easy to verify that the vector field $\upsilon^i(I|\theta)$ obeys the geodesic equations (52):

$$ \upsilon^k(I|\theta) D_k \upsilon^i(I|\theta) = \frac{\upsilon^k(I|\theta)}{\psi(I|\theta)}\left[\delta^i_k - \upsilon^i(I|\theta)\upsilon_k(I|\theta)\right] = 0. \tag{76} $$

Hence, $\upsilon^i(I|\theta)$ can be regarded as the tangent vector:

$$ \frac{dI_g^i(s|e)}{ds} = \upsilon^i[I_g(s|e)|\theta] \tag{77} $$

of the geodesic family $I_g(s|e)$ with unitary affine parametrization centered at the point $\bar I$ with maximum information potential $S(I|\theta)$, $I_g(s = 0|e) = \bar I$. Moreover, the constant unitary vector $e$ parameterizes geodesics with different directions at the origin, $\dot I_g(s = 0|e) = e$. The information dissipation $\Phi(s|e)$ along any of these geodesics is given by the negative of the norm of the gradiental vector:

$$ \Phi(s|e) = -\frac{dI_g^i(s|e)}{ds}\psi_i[I_g(s|e)|\theta] = -\psi[I_g(s|e)|\theta]. \tag{78} $$

Considering equation (42), the norm $\psi(I|\theta)$ can be related to the length $\Delta s$ of the geodesic that connects an arbitrary point $I$ with the point $\bar I$ with maximum information potential, that is, the distance $D_\theta(I, \bar I)$ between the points $I$ and $\bar I$:

$$ \psi(I|\theta) = D_\theta(I, \bar I). \tag{79} $$

According to the gaussian decomposition (64), the hyper-surface with constant information potential $S(I|\theta)$ is also the hyper-surface where the norm of the gradiental generalized forces $\psi(I|\theta)$ is kept constant, that is, the boundary of an $n$-dimensional sphere $S^n(\bar I, \ell)$ centered at the point $\bar I$ with maximum information potential. ■

Corollary 5 The distribution function (36) can be expressed in the following Riemannian
gaussian representation:

$dp(I|\theta) = \frac{1}{Z(\theta)} \exp\left[-\frac{1}{2}\ell^2(I)\right] d\mu(I|\theta)$, (80)

where $\ell(I) = \mathcal{D}_\theta(I, \bar{I})$ is the separation distance between the points $I$ and $\bar{I}$. Consequently, the
knowledge of the metric tensor $g_{ij}(I|\theta)$ and the point $\bar{I}$ with maximum information potential $S(I|\theta)$
fully determines the distribution function $dp(I|\theta)$.

Proof. The Riemannian gaussian representation (80) is a direct consequence of substituting equation (74)
into equation (36). The radius $\ell$ of the n-dimensional sphere $\mathbb{S}^n(\bar{I}, \ell)$ referred to in Theorem 4
and the invariant volume element $d\mu(I|\theta)$ are purely geometric notions derived from the knowledge
of the metric tensor $g_{ij}(I|\theta)$ and the point $\bar{I}$ with maximum information potential $S(I|\theta)$. Equation
(80) shows that the whole statistical description associated with the distribution function (36) can
be rephrased in terms of geometric notions derived from the Riemannian structure of the manifold
$\mathcal{M}_\theta$. ■

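Corollary 6 For points $I$ close to the point $\bar{I}$ with maximum information potential $S(I|\theta)$, the
distribution function (36) admits the following gaussian approximation:

$dp(I|\theta) \simeq \exp\left[-\frac{1}{2} g_{ij}(\bar{I}|\theta)\Delta I^i \Delta I^j\right] \sqrt{\left|\frac{g_{ij}(\bar{I}|\theta)}{2\pi}\right|}\, dI$. (81)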

Proof. The separation distance $\ell(I) = \mathcal{D}_\theta(I, \bar{I})$ can be approximated as follows:

$\ell^2(I) \simeq g_{ij}(\bar{I}|\theta)\Delta I^i \Delta I^j$. (82)

This last expression can be directly obtained from definition (40), where $\Delta I^i = I^i - \bar{I}^i$. At
this level of approximation, the normalization condition implies the following estimate for the gaussian
partition function: $Z(\theta) \simeq 1$. ■




5. Application examples 

5.1. Fluctuation geometry of a one-dimensional statistical manifold $\mathcal{M}_\theta$

Let $dp(I|\theta)$ be a generic parametric family defined on a one-dimensional manifold $\mathcal{M}_\theta$. Let us also
consider that the admissible values of the stochastic variable in the coordinate representation $\mathcal{R}_I$
belong to a certain real subset $(I_{min}, I_{max}) \subset \mathbb{R}$. Due to its general multidimensional character, the
statistical manifold $\mathcal{P}$ of control parameters $\theta$ could be a flat or a curved Riemannian manifold. A
particular example with great relevance in statistical and physical applications is the exponential
family‖:

$dp(I|\theta) = \exp\left[P(\theta) - \theta^\alpha A_\alpha(I) + B(I)\right] dI$. (83)

According to Amari's $\alpha$-connections [1],
the statistical manifold $\mathcal{P}$ associated with the exponential family (83) is trivially flat when the
connection parameter $\alpha = \pm 1$. However, the statistical manifold $\mathcal{P}$ could be a curved manifold
for other values of the connection parameter $\alpha$. A special case is $\alpha = 0$, which corresponds to the
Levi-Civita connection (33) associated with the inference metric tensor $g_{\alpha\beta}(\theta) = -\partial^2 P(\theta)/\partial\theta^\alpha\partial\theta^\beta$.
Regardless of the geometry of the statistical manifold $\mathcal{P}$, the one-dimensional statistical
manifold $\mathcal{M}_\theta$ is always diffeomorphic to the one-dimensional Euclidean manifold $\mathbb{E}$. Clearly, the
curvature notion is only admissible for manifolds with dimension $n \geq 2$, and hence, the one-dimensional
manifold $\mathcal{M}_\theta$ must exhibit a flat geometry. As expected, this type of distribution
function represents the simplest application framework of fluctuation geometry.

The invariant volume element $d\mu(I|\theta)$ of the one-dimensional manifold $\mathcal{M}_\theta$ can be rewritten in
terms of the statistical distance (3) as follows:

$d\mu(I|\theta) = \sqrt{g_{11}(I|\theta)/2\pi}\, dI = ds/\sqrt{2\pi}$, (85)

where $g_{11}(I|\theta)$ denotes the only component of the metric tensor. One can apply the Riemannian
gaussian representation (80) instead of integrating the set of covariant partial
differential equations (57). For convenience, let us first introduce the coordinate reparametrization
$s(I|\theta): \mathcal{R}_I \rightarrow \mathcal{R}_s$ defined by the distance $\mathcal{D}_\theta(I, \bar{I}) = \ell(I)$ as follows:

$s(I|\theta) = \begin{cases} -\mathcal{D}_\theta(I, \bar{I}) & \text{for } I < \bar{I}, \\ +\mathcal{D}_\theta(I, \bar{I}) & \text{for } I \geq \bar{I}, \end{cases}$ (86)

where the metric tensor component $g_{11}(s|\theta) = 1$. Using equations (85) and (86), the Riemannian
gaussian representation (80) can be expressed in the coordinate representation $\mathcal{R}_s$ as:

$dp(s|\theta) = p(s|\theta)\, ds = \frac{1}{Z(\theta)} e^{-s^2/2} \frac{ds}{\sqrt{2\pi}}$. (87)

As discussed in Appendix A, the previous expression allows a straightforward derivation of the
reparametrization function $s(I|\theta)$. Introducing the cumulant distribution function $\mathbf{p}(I|\theta)$:

$\mathbf{p}(I|\theta) = \int_{I_{min}}^{I} p(I'|\theta)\, dI'$, (88)

the reparametrization function $s(I|\theta)$ is given by:

$s(I|\theta) = \Phi^{-1}\left[\mathbf{p}(I|\theta)\right]$. (89)

Here, $\Phi^{-1}(z)$ is the inverse of the function $\Phi(z)$:

$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-s^2/2}\, ds = \frac{1}{2}\left[1 + \mathrm{erf}(z/\sqrt{2})\right]$, (90)

with $\mathrm{erf}(z)$ being the error function:

$\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-x^2}\, dx$. (91)

Figure 1. Dependence of the reparametrization function $s(I|\theta)$ on the cumulant distribution
function $\mathbf{p}(I|\theta)$. Notice that the value corresponding to the maximum information potential,
$s(I|\theta) = 0$, occurs when the cumulant function $\mathbf{p}(I|\theta) = 1/2$.

‖ Classical statistical ensembles, such as the canonical and grand canonical ensembles, belong to the exponential family (83).


The dependence between the reparametrization function $s(I|\theta)$ and the cumulant distribution
function $\mathbf{p}(I|\theta)$ is illustrated in figure 1. By definition of the reparametrization function (86),
the point $\bar{I}$ with maximum information potential $S(I|\theta)$ corresponds to the condition $s(I|\theta) = 0$.
According to equation (89), the information potential $S(I|\theta)$ exhibits its maximum value at the
point $\bar{I}$ where the cumulant distribution function (88) reaches the value $\mathbf{p}(\bar{I}|\theta) = 1/2$. Moreover,
the admissible values of the variable $s$ belong to the entire real space $\mathbb{R}$, $-\infty < s < +\infty$.
The normalization of the gaussian distribution (87) implies that the gaussian partition function
$Z(\theta) = 1$. Thus, the information potential $S(I|\theta)$ is given by:

$S(I|\theta) = -s^2(I|\theta)/2$, (92)

which is a nonpositive function that diverges at the boundary $\partial\mathcal{M}_\theta$ of the statistical manifold $\mathcal{M}_\theta$
(at the boundary points $I_{min}$ and $I_{max}$ in the coordinate representation $\mathcal{R}_I$). The only component
of the metric tensor $g_{11}(I|\theta)$ can be expressed as follows:

$g_{11}(I|\theta) = 2\pi p^2(I|\theta) \exp\left[s^2(I|\theta)\right]$. (93)

Figure 2. Comparison between the probability density function $p(I|\theta)$ (solid line) and the probability
weight $\omega(I|\theta)$ (dashed line) for some simple distribution functions: Panel a) A gaussian
distribution defined on the one-dimensional real space $\mathbb{R}$. Panel b) The superposition of
two different gaussian distributions defined on the one-dimensional real space $\mathbb{R}$. Panel c) A
triangle-like distribution defined on the real segment $[-a, a]$. Panel d) A uniform distribution
defined on the real segment $[0, 1]$. Despite their different appearance, all these distributions are
diffeomorphic, that is, they are equivalent from the viewpoint of fluctuation geometry.

For illustrative purposes, figure 2 shows a comparison between the probability density
function $p(I|\theta)$ and the probability weight $\omega(I|\theta) = p(I|\theta)\sqrt{|2\pi g^{ij}(I|\theta)|} = \exp[S(I|\theta)]$ for some
simple distribution functions $dp(I|\theta)$. While the probability density $p(I|\theta)$ can be a multimodal
function in certain coordinate representations $\mathcal{R}_I$ (as in the case shown in panel b), the probability
weight $\omega(I|\theta)$ is always a monomodal scalar function as a consequence of Theorem 3.
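
The one-dimensional construction can be checked numerically. The following is a minimal sketch, assuming only equations (89), (92) and (93) together with the relation between the probability weight and the information potential; the exponential density used as a test case and all function names are illustrative choices that do not come from the text.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # supplies Phi (cdf) and Phi^{-1} (inv_cdf)


def reparametrization(cumulant: float) -> float:
    """Equation (89): s = Phi^{-1}[cumulant], valid for 0 < cumulant < 1."""
    return STANDARD_NORMAL.inv_cdf(cumulant)


def information_potential(cumulant: float) -> float:
    """Equation (92): S = -s^2 / 2."""
    return -0.5 * reparametrization(cumulant) ** 2


def metric_component(density: float, cumulant: float) -> float:
    """Equation (93): g_11 = 2 pi p^2 exp(s^2)."""
    s = reparametrization(cumulant)
    return 2.0 * math.pi * density ** 2 * math.exp(s * s)


def probability_weight(density: float, cumulant: float) -> float:
    """omega = p * sqrt(2 pi / g_11); it should coincide with exp(S)."""
    return density * math.sqrt(2.0 * math.pi / metric_component(density, cumulant))


# Hypothetical test case (not from the text): p(I) = exp(-I) on (0, +inf).
def exponential_density(i: float) -> float:
    return math.exp(-i)


def exponential_cumulant(i: float) -> float:
    return 1.0 - math.exp(-i)


if __name__ == "__main__":
    for i in (0.1, math.log(2.0), 1.0, 3.0):
        p, c = exponential_density(i), exponential_cumulant(i)
        print(f"I={i:.4f}  s={reparametrization(c):+.4f}  "
              f"S={information_potential(c):+.4f}  g11={metric_component(p, c):.4f}")
```

For this test density the maximum of the information potential sits at the median $I = \log 2$, where the cumulant function equals 1/2, and the derivative $ds/dI$ agrees with $\sqrt{g_{11}}$, as expected from the volume element written in terms of $ds$.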

Summarizing, the present analysis demonstrates that any parametric family $dp(I|\theta)$ defined on
a one-dimensional statistical manifold $\mathcal{M}_\theta$ can always be mapped onto the gaussian distribution
function (87) using the reparametrization function (89). Consequently, all these distribution
functions (such as the ones shown in figure 2) are diffeomorphic to each other. In other words, all of them can
be regarded as an abstract gaussian distribution function defined on the Euclidean manifold $\mathbb{E}$, but
expressed in different coordinate representations $\mathcal{R}_I$¶.

¶ The concept of diffeomorphic distribution functions will be considered in subsection 6.2 to discuss the notion of
intrinsic differential entropy of a statistical manifold $\mathcal{M}_\theta$.

On the other hand, the relationship between
the reparametrization function $s(I|\theta)$ and the cumulant distribution function $\mathbf{p}(I|\theta)$ evidences
the purely statistical significance of the distance notion (3). As expected, fluctuation geometry
establishes a direct correspondence between the geometrical description of the statistical manifold
$\mathcal{M}_\theta$ and its probabilistic description. For example, the following geometrical and probabilistic
inequalities:

$\mathcal{D}_\theta(I, \bar{I}) < \epsilon$ and $|\mathbf{p}(I|\theta) - 1/2| < \frac{1}{2}\mathrm{erf}(\epsilon/\sqrt{2})$ (94)

are fully equivalent.
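
The equivalence of the two inequalities in (94) can be illustrated with a short numerical check. This is a sketch under the assumption that the distance to the point of maximum information potential equals $|s|$ with $s$ given by equation (89); the grid of cumulant values and the function names are illustrative and are not part of the original derivation.

```python
import math
from statistics import NormalDist


def distance_to_maximum(cumulant: float) -> float:
    """|s| with s = Phi^{-1}[cumulant], following equations (86) and (89)."""
    return abs(NormalDist().inv_cdf(cumulant))


def geometric_inequality(cumulant: float, eps: float) -> bool:
    """Left-hand condition of (94): distance smaller than eps."""
    return distance_to_maximum(cumulant) < eps


def probabilistic_inequality(cumulant: float, eps: float) -> bool:
    """Right-hand condition of (94), written for the cumulant function."""
    return abs(cumulant - 0.5) < 0.5 * math.erf(eps / math.sqrt(2.0))


def inequalities_agree(eps: float, samples: int = 999) -> bool:
    """Compare both conditions on a uniform grid of cumulant values in (0, 1)."""
    grid = (k / (samples + 1) for k in range(1, samples + 1))
    return all(
        geometric_inequality(c, eps) == probabilistic_inequality(c, eps)
        for c in grid
    )
```

Because both conditions are expressed through the cumulant function alone, the check does not depend on which one-dimensional density is chosen.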

5.2. Generalization to the n-dimensional Euclidean manifold $\mathbb{E}^n$

Let us suppose that the parametric family $dp(I|\theta)$ can be factorized into independent distribution
functions $dp^{(i)}(I^i|\theta)$ for each stochastic variable $I^i$:

$dp(I|\theta) = \prod_i dp^{(i)}(I^i|\theta)$. (95)

Accordingly, its associated n-dimensional statistical manifold $\mathcal{M}_\theta$ can be decomposed as the external
product of the set of one-dimensional statistical manifolds $\{\mathcal{L}^{(i)}_\theta\}$ as follows:

$\mathcal{M}_\theta = \mathcal{L}^{(1)}_\theta \otimes \mathcal{L}^{(2)}_\theta \cdots \otimes \mathcal{L}^{(n)}_\theta$, (96)

and hence, $\mathcal{M}_\theta$ is diffeomorphic to the n-dimensional Euclidean manifold $\mathbb{E}^n$. The results obtained
in the previous subsection extend straightforwardly to the present situation, considering that
the information potential $S(I|\theta)$ is additive:

$S(I|\theta) = \sum_{i=1}^{n} S^{(i)}(I^i|\theta)$, (97)

while the metric tensor $g_{ij}(I|\theta)$ is diagonal:

$ds^2 = \sum_{i=1}^{n} g_{ii}(I^i|\theta)(dI^i)^2$. (98)

Here, the functions $S^{(i)}(I^i|\theta)$ and $g_{ii}(I^i|\theta)$ are obtained from the probability densities $p^{(i)}(I^i|\theta)$
using equations (88), (89), (92) and (93).
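
A minimal sketch of this factorized construction follows, assuming that each marginal is treated with the one-dimensional formulas (89), (92) and (93). The two marginals (uniform and exponential) and the function names are hypothetical choices made only for illustration.

```python
import math
from statistics import NormalDist


def reparametrization(cumulant: float) -> float:
    """Equation (89) applied to a single marginal."""
    return NormalDist().inv_cdf(cumulant)


def factorized_geometry(densities, cumulants, point):
    """Return (S, [g_11, ..., g_nn]) at `point` for a factorized family.

    S is the sum (97) of the one-dimensional potentials (92), and the list
    holds the diagonal metric components (98) computed with equation (93).
    """
    potential = 0.0
    metric_diagonal = []
    for density, cumulant, value in zip(densities, cumulants, point):
        s = reparametrization(cumulant(value))
        potential += -0.5 * s * s
        metric_diagonal.append(2.0 * math.pi * density(value) ** 2 * math.exp(s * s))
    return potential, metric_diagonal


# Hypothetical marginals: uniform on (0, 1) and exponential on (0, +inf).
densities = (lambda x: 1.0, lambda x: math.exp(-x))
cumulants = (lambda x: x, lambda x: 1.0 - math.exp(-x))

if __name__ == "__main__":
    S, g = factorized_geometry(densities, cumulants, (0.3, 1.7))
    print("S =", S, " diag(g) =", g)
```

With a diagonal metric, the joint probability weight is the joint density multiplied by the product of the factors $\sqrt{2\pi/g_{ii}}$, and it should coincide with $\exp[S(I|\theta)]$ at every point.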

6. Implications of fluctuation geometry in statistics and physics 

6.1. Comparison between fluctuation geometry and inference geometry

As naturally expected, the distance notion of inference geometry (1) allows one to define a statistical
distance $\mathcal{D}(\vartheta, \theta)$ between two members (two different distribution functions) of the parametric family
(2), e.g., by considering the arc length of the geodesic that connects the points $\theta$ and $\vartheta \in \mathcal{P}$. According
to asymptotic formula (31), this statistical distance is associated with the distinguishing probability
of these distribution functions during a statistical inferential procedure. Conversely, the distance
notion of fluctuation geometry (3) allows one to define a statistical distance $\mathcal{D}_\theta(I_1, I_2)$ between two sets of
values of the stochastic variables $I$, which are described by a given member (a specific distribution
function) of the parametric family (2). At first glance, the approximation formula (81) can be
regarded as a counterpart expression of the asymptotic distribution (31) of inference geometry.
Such an analogy clarifies the relevance of the distance notion (3): this second statistical distance
is associated with the occurrence probability of a small fluctuation $\Delta I = I - \bar{I}$ around the point
$\bar{I}$ with maximum information potential $S(I|\theta)$. Remarkably, the asymptotic formula (81) is the
crudest approximation of the Riemannian gaussian representation (80). According to this rigorous
result of fluctuation geometry, the statistical distance $\mathcal{D}_\theta(I, \bar{I})$ is simply a measure of the relative
occurrence probability with respect to the point $\bar{I}$ with maximum information potential:

$\mathcal{D}^2_\theta(I, \bar{I}) = -2\log\left[\frac{\omega(I|\theta)}{\omega(\bar{I}|\theta)}\right]$, (99)

with $\omega(I|\theta)$ being the probability weight.

Although inference geometry and fluctuation geometry are two counterpart approaches, they
provide qualitatively different information about the statistical properties of a given parametric family
(2). On the one hand, the first theory provides geometric information concerning the inference of
the control parameters of a given parametric family (2). On the other hand, the second theory
provides geometric information about the fluctuating behavior of the stochastic variables $I$ for
a given distribution function of the parametric family (2). It is noteworthy that the term geometric
information has a special meaning here: these geometric theories consider those properties of the
statistical manifolds $\mathcal{P}$ and $\mathcal{M}_\theta$ that are independent of their specific coordinate representations
$\mathcal{R}_\theta$ and $\mathcal{R}_I$. Additionally, these two statistical geometries differ with regard to their application
frameworks. Inference geometry only demands the continuous character of the statistical manifold
$\mathcal{P}$ of control parameters $\theta$. Therefore, this type of geometry can be introduced for a parametric
family of distribution functions $p(X|\theta)$ defined on a set of discrete variables $X = \{X_k\}$:

$p(X|\theta) = \{p(X_k|\theta)\,|\,k \in \mathbb{Z}\}$. (100)

Conversely, fluctuation geometry only demands the continuous character of the manifold $\mathcal{M}_\theta$ of
stochastic variables $I$. Therefore, this geometry can be introduced for a continuous distribution
function without control parameters:

$dp(I) = p(I)\,dI$. (101)

As expected, the simultaneous definition of inference geometry and fluctuation geometry is only
possible for parametric families (2) defined on continuous statistical manifolds $\mathcal{M}_\theta$ and $\mathcal{P}$.

A simple look at equations (28) and (57) allows us to realize that these geometric theories
differ in how tractable they are. In particular, the metric tensor $g_{\alpha\beta}(\theta)$ of inference geometry
(28) is very easy to obtain, either from the analytical or the numerical calculation of these integrals.
Conversely, the metric tensor $g_{ij}(I|\theta)$ of fluctuation geometry should be obtained by solving a set of
covariant partial differential equations (57), whose admissible solutions must obey certain boundary
conditions. Actually, this latter mathematical procedure can be a hard task for a manifold $\mathcal{M}_\theta$
with a nontrivial geometry. However, once the metric tensors $g_{\alpha\beta}(\theta)$ and $g_{ij}(I|\theta)$ have been obtained, the
relative tractability of these geometric theories changes in a radical way: fluctuation geometry
turns out to be much more tractable than inference geometry. For example, a modest mathematical effort
has been devoted to arriving at the rigorous Theorems 2-4 and their associated implications, such as
the Riemannian gaussian representation (80). Conversely, inference geometry has to deal with a
very serious difficulty: there does not exist a general way to relate a given parametric family (2) and
the unbiased estimators $\hat{\theta}$ of its control parameters $\theta$. Even the calculation of the distribution
function (30) of the efficient unbiased estimators $\hat{\theta}_{eff}$ is not an easy task. In particular, the
asymptotic distribution (31) is a direct application of the central limit theorem, that is, an
approximation formula for a statistical inference with a large but finite number $m$ of outcomes
$X = \{I^{(1)}, I^{(2)}, \ldots I^{(m)}\}$. A very important question during the historical development of inference
geometry is the so-called higher-order asymptotic theory of statistical estimation [1], where a
fundamental task is the improvement of approximation formula (31) considering a $1/m$-power
expansion based on the intrinsic geometry of the manifold $\mathcal{P}$. A counterpart of the Riemannian gaussian
representation (80) in the framework of inference geometry is unknown in the literature, at least
to the knowledge of the present author.

6.2. On the notion of information entropy for a continuous distribution 

As discussed elsewhere [21], the information entropy $S[p|\theta]$ associated with the discrete distribution
function (100) is written as follows:

$S[p|\theta] = -\sum_k p(X_k|\theta)\log p(X_k|\theta)$. (102)

Conceptually, information entropy is considered as a measure of the unpredictability associated with
a random variable $X$. Interestingly, its counterpart extension for a continuous distribution function:

$dQ(I) = q(I)\,dI$, (103)

the so-called differential entropy:

$S^{(I)}_d[Q|\mathcal{M}_\theta] = -\int_{\mathcal{M}_\theta} q(I)\log q(I)\,dI$, (104)

suffers from an important geometric inconsistency: its definition crucially depends on the coordinate
representation $\mathcal{R}_I$ of the statistical manifold $\mathcal{M}_\theta$:

$S^{(I)}_d[Q|\mathcal{M}_\theta] \neq S^{(\Theta)}_d[Q|\mathcal{M}_\theta] = -\int_{\mathcal{M}_\theta} q(\Theta)\log q(\Theta)\,d\Theta$. (105)

In general, only the expectation values of scalar functions defined on the statistical manifold $\mathcal{M}_\theta$ are
independent of the coordinate representation:

$A(I) = A(\Theta) \Rightarrow \langle A(I)\rangle = \langle A(\Theta)\rangle$. (106)

However, the probability density function $q(I)$ is simply a tensorial density, whose values and
general mathematical behavior crucially depend on the concrete coordinate representation $\mathcal{R}_I$ of
the statistical manifold $\mathcal{M}_\theta$. Consequently, the consideration of the quantity $\mathcal{I}_q(I) = -\log q(I)$ as
a local measure of the information content is ill-defined from the geometric viewpoint because it
violates the requirement of covariance under the coordinate reparametrization $\Theta(I): \mathcal{R}_I \rightarrow \mathcal{R}_\Theta$ of
the statistical manifold $\mathcal{M}_\theta$. Despite their apparent similarity, the differential entropy (104) is not
a good generalization of the statistical entropy (102) for the framework of continuous distribution
functions. For example, the differential entropy (104) does not obey other properties of its discrete
counterpart (102), in particular, the nonnegativity $S[p|\theta] \geq 0$.
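
Both shortcomings can be seen in a small numerical experiment. The sketch below estimates the differential entropy (104) with a midpoint rule; the density $q(I) = 2I$ on $(0, 1)$ and the reparametrization $\Theta = I^2$, which turns it into a uniform density on $(0, 1)$, are hypothetical examples chosen for illustration.

```python
import math


def differential_entropy(density, lower, upper, steps=20_000):
    """Midpoint-rule estimate of -integral of q log q over (lower, upper)."""
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        q = density(lower + (k + 0.5) * width)
        if q > 0.0:
            total -= q * math.log(q) * width
    return total


# Hypothetical example: the same distribution in two coordinate representations.
# In the coordinate I the density is q(I) = 2I on (0, 1); the change of
# variable Theta = I^2 gives the uniform density q(Theta) = 1 on (0, 1).
entropy_in_I = differential_entropy(lambda i: 2.0 * i, 0.0, 1.0)
entropy_in_Theta = differential_entropy(lambda t: 1.0, 0.0, 1.0)

if __name__ == "__main__":
    print("coordinate I:    ", entropy_in_I)
    print("coordinate Theta:", entropy_in_Theta)
```

For this example the exact values are $1/2 - \log 2$ in the coordinate $I$ and $0$ in the coordinate $\Theta$, so the result changes with the representation and is negative in one of them.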

An attempt to overcome some of the above inconsistencies was developed by Jaynes [25].
According to this author, the correct formula for the information entropy of a continuous
distribution function can be derived by taking the limit of increasingly dense discrete distributions.
Specifically, Jaynes proposed to start from a set of $n$ discrete points $\mathcal{S}_n = \{I_i\} \subset \mathcal{M}_\theta$, whose
density $\gamma_n(I) = \frac{1}{n}\sum_{i=1}^{n}\delta(I - I_i)$ approaches a certain function $\gamma(I)$ in the limit $n \rightarrow \infty$:

$\gamma(I) = \lim_{n\rightarrow\infty} \gamma_n(I)$. (107)

The density function $\gamma(I)$ is referred to as the invariant measure. Combining the previous argument
and the discrete definition of information entropy (102), this author arrived at the following
correction for the differential entropy:

$S^{\gamma}_d[Q|\mathcal{M}_\theta] = -\int_{\mathcal{M}_\theta} q(I)\log\left[\frac{q(I)}{\gamma(I)}\right] dI$. (108)

At first glance, the present formula is similar to, but conceptually distinct from, the (negative of the)
Kullback-Leibler divergence [20]:

$D_{KL}(Q|P) = \int_{\mathcal{M}_\theta} q(I)\log\left[\frac{q(I)}{p(I)}\right] dI$. (109)

Like many other divergences considered in statistics [20], the Kullback-Leibler divergence (109) is a
measure of the separation of a distribution function $dQ(I) = q(I)\,dI$ from a reference distribution
$dP(I) = p(I)\,dI$. In formula (108), however, the invariant measure $\gamma(I)$ need not be a
probability density, but simply a density. In particular, it need not satisfy the normalization
condition:

$\int_{\mathcal{M}_\theta} \gamma(I)\,dI = 1$. (110)
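
The role of the invariant measure can be illustrated numerically. In the sketch below, formula (108) is evaluated in two coordinate representations of the same distribution; the density $q(I) = 2I$ on $(0, 1)$, the measure $\gamma(I) = 1$ and the reparametrization $\Theta = I^2$ are hypothetical choices, and both $q$ and $\gamma$ are transformed as densities.

```python
import math


def jaynes_entropy(density, measure, lower, upper, steps=20_000):
    """Midpoint-rule estimate of -integral of q log(q / gamma), equation (108)."""
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * width
        q, gamma = density(x), measure(x)
        if q > 0.0:
            total -= q * math.log(q / gamma) * width
    return total


# Coordinate I: q(I) = 2I and gamma(I) = 1 on (0, 1) (hypothetical choices).
entropy_in_I = jaynes_entropy(lambda i: 2.0 * i, lambda i: 1.0, 0.0, 1.0)

# Coordinate Theta = I^2: both densities pick up the factor |dI/dTheta|,
# which gives q(Theta) = 1 and gamma(Theta) = 1 / (2 sqrt(Theta)).
entropy_in_Theta = jaynes_entropy(
    lambda t: 1.0, lambda t: 0.5 / math.sqrt(t), 0.0, 1.0
)

if __name__ == "__main__":
    print(entropy_in_I, entropy_in_Theta)
```

Both estimates approach $1/2 - \log 2$, because the ratio $q/\gamma$ is a scalar; the small residual difference comes from the integrable logarithmic singularity of the second integrand at $\Theta = 0$.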

Although Jaynes' differential entropy (108) is invariant under coordinate reparametrizations,
the success achieved with this correction formula is only partial. In fact, Jaynes was unable
to provide a general criterion to specify the invariant measure $\gamma(I)$ for a concrete application.
Referring to this ambiguity, he recognized that . . . the following arguments can be made as rigorous
as we please, but at considerable sacrifice of clarity [22]. Remarkably, the pre-existence of a
Riemannian structure defined on the statistical manifold $\mathcal{M}_\theta$ introduces a natural choice for the
invariant measure $\gamma(I)$. While the probability density $q(I)$ of a distribution function $dQ(I)$ depends
on the coordinate representation $\mathcal{R}_I$, the notion of probability weight+:

$q_g(I) = q(I)\sqrt{|2\pi g^{ij}(I|\theta)|}$ (111)

represents a scalar function defined on the statistical manifold $\mathcal{M}_\theta$. Using the probability weight
$q_g(I)$ instead of the probability density $q(I)$, the quantity $\mathcal{I}_g(I|\theta) = -\log q_g(I)$ can be introduced
as a local invariant measure of the information content. Thus, a more appropriate generalization
of information entropy for a continuous distribution function is given by:

$S^g_d[Q|\mathcal{M}_\theta] = \langle\mathcal{I}_g(I|\theta)\rangle = -\int_{\mathcal{M}_\theta} q_g(I)\log q_g(I)\,d\mu(I|\theta)$, (112)

where the index $g$ denotes the Riemannian structure of the statistical manifold $\mathcal{M}_\theta$. It is noteworthy
that equation (112) is a particular case of Jaynes' differential entropy (108), where the invariant
measure $\gamma(I)$ is determined by the metric tensor $g_{ij}(I|\theta)$ of the statistical manifold $\mathcal{M}_\theta$:

$\gamma(I) = \sqrt{|g_{ij}(I|\theta)/2\pi|} \equiv 1/\sqrt{|2\pi g^{ij}(I|\theta)|}$. (113)

According to Jaynes' argument, the invariant measure (113) can be obtained as the limit of
an increasingly dense subset of points $\mathcal{S}_n$ that are uniformly distributed on the statistical manifold
$\mathcal{M}_\theta$, that is, a distribution function whose probability weight $\gamma_g(I|\theta) = 1$.

The geometric differential entropy (112) depends both on the distribution function $dQ(I)$
and on the Riemannian structure of the statistical manifold $\mathcal{M}_\theta$. According to the postulates of
fluctuation geometry, the Riemannian structure of the statistical manifold $\mathcal{M}_\theta$ is associated with a
reference distribution function $dp(I|\theta)$, specifically, the distribution function (36) derived from the
knowledge of the information potential $S(I|\theta)$. Therefore, it is worth distinguishing between two
different notions of differential entropy:

• The differential entropy $S^g_d[Q|\mathcal{M}_\theta]$ of an arbitrary distribution function $dQ(I)$ defined on a
statistical manifold $\mathcal{M}_\theta$, which is endowed with a pre-existent Riemannian structure.

• The notion of intrinsic differential entropy $S^g_d[\mathcal{M}_\theta]$ of a statistical manifold $\mathcal{M}_\theta$, that is,
the differential entropy $S^g_d[p|\mathcal{M}_\theta]$ of the distribution function $dp(I|\theta)$ associated with the
Riemannian structure of the statistical manifold $\mathcal{M}_\theta$.

+ The notion of probability weight was considered in equation (48) to introduce the information potential $S(I|\theta)$.

The intrinsic differential entropy $S^g_d[\mathcal{M}_\theta]$ of the statistical manifold $\mathcal{M}_\theta$ is given by the
negative of the expectation value of the information potential $S(I|\theta)$:

$S^g_d[\mathcal{M}_\theta] = -\langle S(I|\theta)\rangle$, (114)

which can be rewritten using equation (74) as follows:

$S^g_d[\mathcal{M}_\theta] = \frac{1}{2}\langle\mathcal{D}^2_\theta(I, \bar{I})\rangle - \mathcal{P}(\theta)$. (115)

Accordingly, $S^g_d[\mathcal{M}_\theta]$ is a global geometric measure of a statistical manifold $\mathcal{M}_\theta$, which depends
on its topological properties and Riemannian structure, as well as the position of the point $\bar{I}$ with
maximum information potential. In particular, if the statistical manifold $\mathcal{M}_\theta$ can be decomposed
into two independent Riemannian manifolds $A$ and $B$ as $\mathcal{M}_\theta = A \otimes B$, its intrinsic differential
entropy $S^g_d[\mathcal{M}_\theta]$ is additive:

$S^g_d[\mathcal{M}_\theta] = S^{g_A}_d[A] + S^{g_B}_d[B]$, (116)

where $g_A$ and $g_B$ denote their respective Riemannian structures.

Before we end this section, it is worth remarking that the requirement of covariance is a
strong constraint. The existence of this symmetry in the differential entropy (112) implies that
diffeomorphic distribution functions exhibit the same value of their intrinsic differential entropies.
As already commented, all distribution functions illustrated in figure 2 are diffeomorphic. In
fact, their statistical manifolds $\mathcal{M}_\theta$ are diffeomorphic to the one-dimensional Euclidean manifold
$\mathbb{E}$. Thus, their respective intrinsic differential entropies exhibit the same value $S^g_d[\mathbb{E}] = 1/2$.
This result is easy to obtain using the gaussian distribution (87) associated with the coordinate
representation $\mathcal{R}_s$ of the statistical manifold $\mathcal{M}_\theta$. Moreover, the n-dimensional manifold $\mathcal{M}_\theta$
associated with a distribution function obeying the decomposition (95) has an intrinsic differential
entropy $S^g_d[\mathbb{E}^n] = n/2$, which is a direct consequence of the property (116).
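
The value 1/2 can be verified numerically for one-dimensional distributions. The sketch below assumes equation (114) together with the one-dimensional results (89) and (92), so that the intrinsic differential entropy reduces to the expectation value of $s^2/2$; the two test densities (uniform and $q(I) = 2I$ on $(0, 1)$) and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def intrinsic_entropy(density, cumulant, lower, upper, steps=50_000):
    """Midpoint estimate of <s^2>/2 = -<S>, with s = Phi^{-1}[cumulant(I)]."""
    normal = NormalDist()
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * width
        s = normal.inv_cdf(cumulant(x))  # equation (89)
        total += 0.5 * s * s * density(x) * width
    return total


# Hypothetical test densities on (0, 1): uniform, and q(I) = 2I with cumulant I^2.
uniform_value = intrinsic_entropy(lambda i: 1.0, lambda i: i, 0.0, 1.0)
triangular_value = intrinsic_entropy(lambda i: 2.0 * i, lambda i: i * i, 0.0, 1.0)

if __name__ == "__main__":
    print(uniform_value, triangular_value)
```

Both estimates are close to 1/2 even though the two densities look different in the coordinate $I$, which is the numerical counterpart of the statement that diffeomorphic distribution functions share the same intrinsic differential entropy.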

6.3. Riemannian reformulation of Einstein fluctuation theory 

Classical fluctuation theory starts from Einstein postulate [23]*:

$dp(I|\theta) = A e^{S(I|\theta)}\,dI$, (117)

which describes the fluctuating behavior of a set of macroscopic observables $I$ in an equilibrium
situation driven by certain control parameters $\theta$. Here, $S(I|\theta)$ denotes the entropy of a closed
system, while $A$ is a normalization constant. Since the function $p(I|\theta) = A e^{S(I|\theta)}$ is a tensorial
density, the entropy $S(I|\theta)$ considered in Einstein postulate (117) behaves under a coordinate
reparametrization $\Theta(I): \mathcal{R}_I \rightarrow \mathcal{R}_\Theta$ as follows:

$S(\Theta|\theta) = S(I|\theta) - \log\left|\frac{\partial\Theta}{\partial I}\right|$, (118)

and hence, it is not a scalar function. This feature directly contradicts the thermodynamic
relevance of entropy as a state function‡. Consequently, Einstein postulate (117) is ill-defined from
the geometric viewpoint.

* The Boltzmann constant $k$ is set to unity, $k = 1$.

As already discussed in a recent paper [18], the previous inconsistency disappears when one
redefines Einstein postulate into the covariant form (36). Thus, the entropy of a closed system should
be identified with the information potential $S(I|\theta)$ of fluctuation geometry up to
an additive constant. As expected, the covariant distribution function (36) also depends on
the Riemannian structure of the statistical manifold $\mathcal{M}_\theta$ of the macroscopic observables $I$. Such
an ambiguity disappears considering the relationship between the metric tensor $g_{ij}(I|\theta)$ and the
negative of the covariant Hessian of the scalar entropy $S(I|\theta)$, equation (51). Consequently, the
fluctuating behavior of the system is fully determined by the knowledge of the entropy $S(I|\theta)$. From the
physical viewpoint, the constraint (51) between the metric tensor $g_{ij}(I|\theta)$ and the entropy $S(I|\theta)$
is very relevant. In fact, such a choice ensures the geodesic character of the hydrodynamic
equations of the system [18]:

$\frac{dI^i(s)}{ds} = -\upsilon^i[I(s)|\theta]$. (119)

Here, the parameter $s$ denotes the arc length of the curve of hydrodynamic relaxation $I(s) \in \mathcal{M}_\theta$.
This curve can be characterized by the tangent vector field $\xi(s)$ with contravariant components:

$\xi^i(s) = \frac{dI^i(s)}{ds}$, (120)

which is a unitary vector field, $g_{ij}(I|\theta)\xi^i(s)\xi^j(s) = 1$. The tangent vector field $\xi(s)$ is oriented in
the same direction as the generalized restituting force $\zeta(I|\theta)$ with covariant components $\zeta_i(I|\theta) =
\partial S(I|\theta)/\partial I^i = -\psi_i(I|\theta)$. As expected, the generalized restituting force is directed towards the
equilibrium configuration $\bar{I}$ (the point with maximum entropy), whose direction is determined by
the negative of the unitary vector field $\upsilon^i(I|\theta)$ introduced in equation (75). As already shown
in the proof of Theorem 4, the unitary vector field $\upsilon^i(I|\theta)$ obeys the geodesic differential equations
(52). In physics, geodesics are regarded as the natural motions in theories with a Riemannian
formulation, for example, in general relativity. This nontrivial result establishes an interesting
connection between the present Riemannian reformulation of equilibrium fluctuation theory and its
possible nonequilibrium generalization.

‡ A state function is a property of a system that depends only on its current state, not on the way in which the system
acquired that state. Geometrically, the value of a state function should not depend on the coordinate representation
employed to describe that state, that is, it should be a scalar function.

6.4. Relation with Ruppeiner's geometry of thermodynamics

Ruppeiner proposed in the past a Riemannian geometry for thermodynamics [24]. Despite its
long history in the literature, Ruppeiner's geometry suffers from some inconsistencies. The most
important is that this approach assumes the scalar character of the entropy $S(I|\theta)$ employed in Einstein
postulate (117). Ruppeiner himself recognized the restricted applicability of this consideration
(see subsection II.B of his paper in Reviews of Modern Physics [24]). However, he justified its
application as a good approximation in the framework of large thermodynamic systems, whose
fluctuating behavior can be described within the gaussian approximation:

$dp(I|\bar{I}) \simeq \exp\left[-\frac{1}{2} g_{ij}(\bar{I})\Delta I^i \Delta I^j\right] \sqrt{\left|\frac{g_{ij}(\bar{I})}{2\pi}\right|}\, dI$. (121)

Here, $\Delta I^i = I^i - \bar{I}^i$ denotes a small deviation from the most likely (equilibrium) state $\bar{I}$, while
$g_{ij}(\bar{I})$ represents the thermodynamic metric tensor [24]:

$g_{ij}(\bar{I}) = -\frac{\partial^2 S(\bar{I}|\theta)}{\partial I^i \partial I^j}$. (122)

Notice that the thermodynamic metric tensor is expressed here in terms of the most likely state
$\bar{I}$ instead of the control parameters $\theta$. Apparently, the goal of this consideration is to justify the
relevance of the distance notion:

$ds^2 = g_{ij}(\bar{I})\, d\bar{I}^i d\bar{I}^j$ (123)

as a thermodynamic distance between equilibrium states. The previous consideration, however,
has a restricted applicability even in the framework of large thermodynamic systems. Since the
function $p(I|\theta) = A e^{S(I|\theta)}$ is a tensorial density, the probability distribution (117) can exhibit
two or more maxima $\bar{I}$ for certain values of the control parameters $\theta$. Consequently, it is not possible
to guarantee a bijective correspondence between the control parameters $\theta$ and the equilibrium
states $\bar{I}$ (the most likely values of the macroscopic observables). This type of situation is associated
with the phenomenon of ensemble inequivalence in statistical mechanics, which is observed during
the occurrence of discontinuous phase transitions [23].

Fluctuation geometry successfully overcomes the previous limitations of Ruppeiner's geometry
[18]. As already commented, the key considerations are (i) to assume the covariant redefinition of
Einstein postulate (36) to ensure the scalar character of the entropy $S(I|\theta)$; and (ii) to generalize
the thermodynamic metric tensor (122) considering the scalar entropy $S(I|\theta)$ and the covariant
differentiation $D_i$:

$g_{ij}(\bar{I}) = -\frac{\partial^2 S(\bar{I}|\theta)}{\partial I^i \partial I^j} \rightarrow g_{ij}(I|\theta) = -D_i D_j S(I|\theta)$. (124)

From this viewpoint, the Riemannian reformulation of Einstein fluctuation theory arises as a formal
improvement of Ruppeiner's geometry. Rewriting the previous relation as follows:

$g_{ij}(I|\theta) = -\frac{\partial^2 S(I|\theta)}{\partial I^i \partial I^j} + \Gamma^k_{ij}(I|\theta)\frac{\partial S(I|\theta)}{\partial I^k}$, (125)

it is easy to check that the metric tensor $g_{ij}(I|\theta)$ reduces to the thermodynamic metric tensor
(122) at the point $\bar{I}$ with maximum entropy, where $\partial S(I|\theta)/\partial I^k = 0$. However, the existence
and uniqueness of the point $\bar{I}$ is now guaranteed by Theorem 3. Moreover, the metric tensor
$g_{ij}(I|\theta)$ of fluctuation geometry is well-defined for any macroscopic state $I \in \mathcal{M}_\theta$. While the
gaussian approximation (121) is only applicable to large thermodynamic systems with a small
fluctuating behavior, the Riemannian gaussian representation (80) is a rigorous result. It is noteworthy
that the application of fluctuation geometry in classical fluctuation theory cannot be regarded
as a Riemannian approach to thermodynamics, but as a Riemannian approach to classical statistical
mechanics††.

†† Thermodynamics is a macroscopic physical theory that disregards the incidence of fluctuations.

7. Final remarks and open problems

Fluctuation geometry was proposed in this work as a counterpart approach of inference geometry.
This new geometry allows the introduction of the distance notion $ds^2 = g_{ij}(I|\theta)\,dI^i dI^j$ to
characterize a statistical distance between two different values of the stochastic variables $I$, whose
behavior is described by a member of a parametric family of continuous distribution functions
$dp(I|\theta) = p(I|\theta)\,dI$. The metric tensor $g_{ij}(I|\theta)$ has been derived starting from a set of axioms,
which lead to a set of covariant differential equations (57) written in terms of the probability
density $p(I|\theta)$. The main consequence is the possibility of rephrasing the probabilistic description in
terms of purely geometric notions, as in the case of the Riemannian gaussian representation (80). Thus, the
statistical description can be equivalently performed using the language of Riemannian geometry,
and hence, fluctuation geometry represents an alternative framework for applying the powerful tools
of differential geometry to statistical analysis. As already evidenced, the present approach leads
to a reconsideration of the notion of information entropy for a continuous distribution as well as
the Riemannian reformulation of Einstein fluctuation theory.

Before we end this section, let us comment on some open problems that deserve special attention
in future work. Firstly, it is important to clarify the existence and uniqueness of the solution of
the covariant differential equations (57). As already evidenced, the postulates of fluctuation geometry
allow a univocal determination of the Riemannian structure of the statistical manifold $\mathcal{M}_\theta$ for the
application examples discussed in this work, which exhibit a trivial Euclidean (flat) geometry. Thus,
it is necessary to check whether such existence and uniqueness are preserved in a statistical manifold
$\mathcal{M}_\theta$ with a more complex Riemannian structure.

Secondly, one expects that the curvature of the manifold $\mathcal{M}_\theta$ should play a fundamental role
from the statistical viewpoint. Some basic arguments suggest that curvature should be associated
with the notion of statistical correlations. Both curvature and statistical correlations,
for example, can only be defined when the dimension $n$ of the statistical manifold $\mathcal{M}_\theta$ is equal to
or larger than two. Interestingly, the existence of a decomposition (95) in a parametric family
implies the flat character of the statistical manifold $\mathcal{M}_\theta$. This possible relevance of the curvature
of the statistical manifold $\mathcal{M}_\theta$ is consistent with some physical analogies. General Relativity,
for example, identifies the gravitational interaction with the curvature of the space-time $\mathcal{M}_4$.
In the framework of statistical theories such as quantum mechanics, statistical correlations can be
regarded as the counterpart of interactions. The gas of non-interacting particles obeying
Fermi-Dirac statistics, in particular, manifests effective repulsive forces as a consequence of the inter-particle
correlations associated with Pauli's exclusion principle. By analogy, a non-vanishing curvature
of a statistical manifold such as $\mathcal{M}_\theta$ would be associated with the existence of irreducible statistical
correlations. The analysis of this conjecture will be the main interest of a forthcoming paper.

A third question is how deep the analogy between inference geometry and
fluctuation geometry is. Specifically, it is natural to wonder whether each result obtained in one of these
theories has a counterpart in the other theory. The gaussian approximation (81), for example,
can be regarded as a counterpart of the asymptotic distribution (31) of inference theory.
Starting from the fact that the gaussian distribution (81) admits the exact improvement (80), the
underlying analogy strongly suggests the following improvement:



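$$dQ^m(\vartheta|\theta) = \frac{1}{Z^m(\theta)} \exp\left[-\frac{1}{2} m\,\ell^2(\vartheta|\theta)\right] \sqrt{\left|\frac{m\,g_{\alpha\beta}(\vartheta)}{2\pi}\right|}\, d\vartheta \qquad (126)$$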

for the asymptotic distribution (31) of inference theory. Here, $\ell(\vartheta|\theta)$ should represent the distance
$\mathcal{D}(\vartheta, \theta)$ between the points $\vartheta$ and $\theta$ calculated with the metric tensor $g_{\alpha\beta}(\theta)$ defined on the statistical
manifold $\mathcal{P}$ of control parameters $\theta$. It would be interesting to analyze this conjecture. Finally,
it would be interesting to analyze other implications of fluctuation geometry in the framework
of information theory [21]. A question of special interest is the application of the notion of
intrinsic differential entropy to the problem of maximum information distributions [25].

Appendix A. Derivation of the reparametrization function $s(I|\theta)$

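Let us denote by $s_1$ and $s_2$ the coordinates that correspond to the boundary points $I_{min}$ and $I_{max}$
in the new coordinate representation $\mathcal{R}_s$. The integration of equation (87) yields the following
relation:

$$p(I|\theta) = \frac{\Phi(s) - \Phi(s_1)}{\Phi(s_2) - \Phi(s_1)}. \qquad (A.1)$$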

Here, $p(I|\theta)$ represents the cumulative distribution function (55) and $\Phi(z)$ the function (90).
Moreover, it was taken into account that the normalization condition of the distribution function
$dp(I|\theta)$ implies the relation $Z(\theta) = \Phi(s_2) - \Phi(s_1)$. Introducing the inverse function $\Phi^{-1}(z)$, the
reparametrization function $s(I|\theta): \mathcal{R}_I \rightarrow \mathcal{R}_s$ can be expressed as follows:

$$s(I|\theta) = \Phi^{-1}\left[\phi + \sigma p(I|\theta)\right], \qquad (A.2)$$

where $\phi = \Phi(s_1)$ and $\sigma = \Phi(s_2) - \Phi(s_1)$. Considering the expression (74), the information potential
$S(I|\theta)$ can be expressed as follows:

$$S(I|\theta) = -\log \sigma - \frac{1}{2} s^2(I|\theta). \qquad (A.3)$$
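
Relations (A.2) and (A.3) can be checked numerically with a short sketch. It assumes that $\Phi(z)$ is the standard normal cumulative distribution (the function (90) is not reproduced here), that the one-dimensional metric is $g_{11} = (ds/dI)^2$, and that the density is recovered through relation (A.4). The test density $\rho = I e^{-I}$ on $I \geq 0$ and all function names are illustrative choices, not part of the original derivation.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # assumption: Phi(z) is the standard normal CDF

def cumulative(I):
    # Cumulative distribution p(I|theta) of the test density I * exp(-I)
    return 1.0 - (1.0 + I) * math.exp(-I)

def density(I):
    # Test density rho(I|theta); it vanishes at I = 0 and as I grows
    return I * math.exp(-I)

def reparam(I, phi=0.0, sigma=1.0):
    # Relation (A.2): s = Phi^{-1}[phi + sigma * p(I|theta)]
    return PHI.inv_cdf(phi + sigma * cumulative(I))

def potential(I, phi=0.0, sigma=1.0):
    # Relation (A.3): S = -log(sigma) - s^2 / 2
    return -math.log(sigma) - 0.5 * reparam(I, phi, sigma) ** 2

def metric(I, phi=0.0, sigma=1.0, h=1e-6):
    # Assumed g_11 = (ds/dI)^2, with ds/dI from a central difference
    ds = (reparam(I + h, phi, sigma) - reparam(I - h, phi, sigma)) / (2.0 * h)
    return ds * ds

def density_from_geometry(I, phi=0.0, sigma=1.0):
    # Relation (A.4): rho = exp(S) * sqrt(g_11 / (2 pi))
    g = metric(I, phi, sigma)
    return math.exp(potential(I, phi, sigma)) * math.sqrt(g / (2.0 * math.pi))

if __name__ == "__main__":
    for I in (0.2, 1.0, 2.5):
        print(I, density(I), density_from_geometry(I))
```

Under these assumptions the geometric expression reproduces the test density at interior points for $\phi = 0$, $\sigma = 1$ and also for other constants such as $\phi = 0.1$, $\sigma = 0.8$; the sketch does not probe the boundary behaviour.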

Both the reparametrization function (A.2) and the information potential (A.3) depend on the
nonnegative constants $\phi$ and $\sigma$, whose values are determined by the boundary points $s_1$ and $s_2$
in the coordinate representation $\mathcal{R}_s$. However, a careful analysis reveals that these parameters
cannot admit arbitrary values. Firstly, one should notice that the information potential (A.3) is
everywhere finite when $|s_{1,2}| < \infty$. Taking into account the relationship between the information
potential $S(I|\theta)$ and the probability density $\rho(I|\theta)$:



$$\rho(I|\theta) = \exp\left[S(I|\theta)\right] \sqrt{\frac{g_{11}(I|\theta)}{2\pi}}, \qquad (A.4)$$



the vanishing of the probability density $\rho(I|\theta)$ at the boundary $\partial\mathcal{M}_\theta$ of the statistical manifold $\mathcal{M}_\theta$
(Axiom 4) implies the vanishing of the metric tensor $g_{11}(I|\theta)$ on the boundary $\partial\mathcal{M}_\theta$. However,
the metric tensor $g_{11}(I|\theta)$ should be non-vanishing everywhere, even on the boundary $\partial\mathcal{M}_\theta$ of the
statistical manifold $\mathcal{M}_\theta$. The existence of the contravariant metric tensor $g^{ij}(I|\theta)$, in particular,
demands the non-vanishing of the metric tensor determinant $|g_{ij}(I|\theta)|$. For a one-dimensional
manifold $\mathcal{M}_\theta$, this last requirement implies $0 < |g_{11}(I|\theta)| < \infty$. The only way to fulfil such a
requirement is to impose the constraints $\phi = 0$ and $\sigma = 1 \Leftrightarrow s_1 = -\infty$ and $s_2 = +\infty$, which leads
to equation